Abstract: We perform a model-independent fit of the short-distance couplings $C_{7,9,10}$
within the Standard Model set of $b \to s\gamma$ and $b \to s\bar\ell\ell$ operators. Our analysis of $B \to K^*\gamma$,
$B \to K^{(*)}\bar\ell\ell$ and $B_s \to \bar\mu\mu$ decays is the first to harness the full power of the Bayesian
approach: all major sources of theory uncertainty explicitly enter as nuisance parameters.
Exploiting the latest measurements, the fit reveals a flipped-sign solution in addition to
a Standard-Model-like solution for the couplings $C_i$. Each solution contains about half
of the posterior probability, and both have nearly equal goodness of fit. The Standard
Model prediction is close to the best-fit point. No New Physics contributions are necessary
to describe the current data. Benefitting from the improved posterior knowledge of the
nuisance parameters, we predict ranges for currently unmeasured, optimized observables
in the angular distributions of $B \to K^*(\to K\pi)\bar\ell\ell$.

1 Introduction



In the course of the last decade, rare $B$-meson decays were discovered that are mediated at
the parton level by flavor-changing neutral-current (FCNC) transitions $b \to s\bar\ell\ell$ and $b \to s\gamma$.
They allow one to test Standard-Model (SM) predictions at the loop level and to conduct
searches for indirect signals of physics beyond the SM (BSM), providing strong constraints
on the corresponding fundamental parameters, especially in the quark flavor sector.

The radiative FCNC decay $B \to K^*\gamma$ was first observed by the CLEO collaboration at
the Cornell Electron Storage Ring [1]. The first-generation B factory experiments BaBar
[2-9] and Belle [10-13] observed rare radiative and semileptonic FCNC decays of the $B$
meson with branching fractions of $10^{-5}$ to $10^{-7}$. They measured branching ratios and
spectral information for a set of inclusive $B \to X_s\bar\ell\ell$ and exclusive $B \to K^{(*)}\bar\ell\ell$ ($\ell = e, \mu$)
decays. Recently, additional exclusive $B \to K^{(*)}\bar\mu\mu$ decay modes were measured by the
hadron collider experiments CDF [14-16] at the Tevatron and LHCb [17, 18] at the Large
Hadron Collider (LHC). The complete analyses of the full BaBar, Belle and CDF data sets
are expected to be published soon. LHCb is about to significantly improve the accuracy of
measurements of exclusive decays, eventually dominating the other experiments in terms
of collected numbers of events by the end of 2012. The multipurpose LHC experiments
ATLAS and CMS are expected to perform similar searches.

In the last decade D0 [19] and CDF [20, 21] significantly improved the upper bound on
the branching ratio of the very rare leptonic decay $B_s \to \bar\mu\mu$ by several orders of magnitude.
Currently, LHCb, CMS, and ATLAS continue this search [22-26], and a future discovery
at SM rates of about $3 \times 10^{-9}$ is possible with sufficient luminosity [27].

Theory predictions of the inclusive decay $B \to X_s\bar\ell\ell$ have reached the next-to-next-to-leading
order (NNLO) [28-34]. Contrary to exclusive decays, it only depends on nonperturbative
hadronic matrix elements at subleading order in the Heavy Quark Expansion. But
the current measurements of its branching fraction are still very uncertain [2, 11] and provide
only very limited spectral information in the dilepton invariant mass. This situation
is not likely to improve until the end of the run of the superflavor factories Belle II [35] and
possibly SuperB [36] around the year 2020. Therefore, it is desirable to include exclusive
decays in tests of the SM and searches for BSM signals, especially in view of the high number
of events expected at LHCb. Moreover, the angular distribution of the exclusive decay
$B \to K^*(\to K\pi)\bar\ell\ell$ with a 4-body final state offers a multitude of optimized observables;
see [37] for a short summary.

Both inclusive and exclusive decays are described by the $\Delta B = 1$ effective theory of
electroweak interactions of the SM and its extensions. They provide constraints on the
effective short-distance couplings which are known precisely in the SM and are the main
objects of interest due to their sensitivity to BSM effects at the electroweak scale.

Exclusive decays typically require the inclusion of final-state-specific nonperturbative
(hadronic) QCD effects, which complicate the extraction of the short-distance couplings.
Analyses are further complicated by the background processes $b \to s + (\bar q q) \to s + \bar\ell\ell$ induced
by 4-quark operators $b \to s\bar q q$ ($q = u, d, s, c$). In particular, the narrow $J/\psi$ and $\psi'$
resonances constitute huge backgrounds to the short-distance-dominated $b \to s\bar\ell\ell$ processes.
Consequently, theory predictions focus on the $q^2$ regions below and above both resonances
and are usually referred to as being in the low- or high-$q^2$ regions, or at large and low
hadronic recoil. QCD factorization (QCDF) [38-41] or Soft Collinear Effective Theory
(SCET) [42, 43] is applied at large recoil, $E \sim m_b$, of the $K^{(*)}$ system, typically in the
range $1\,\mathrm{GeV}^2 \lesssim q^2 \lesssim 6\,\mathrm{GeV}^2$. The expansion parameter $\lambda = \Lambda/E$ is of the order $\Lambda/m_b$,
resulting in a double expansion in $\lambda$ and the QCD coupling constant $\alpha_s$. Here $\Lambda$ denotes
a scale associated with nonperturbative QCD dynamics, typically < 500 MeV. The exact
interpretation is process dependent and specific to the expansion. An operator product
expansion (OPE) of the 4-quark contributions can be performed [44, 45] at low hadronic
recoil for $q^2 \gtrsim (14\text{-}15)\,\mathrm{GeV}^2$ with the expansion parameter $\lambda = \Lambda/\sqrt{q^2} \sim \Lambda/m_b$. Moreover,
form-factor relations [46-49] from the symmetries of QCD dynamics guide the construction
of observables with reduced hadronic uncertainties in both kinematic regions.

A large number of phenomenological studies considering form-factor symmetries have
focused mainly on the decay $B \to K^*(\to K\pi)\bar\ell\ell$. The angular distribution of its 4-body
final state [37, 50] comprises on the order of ten observables that provide complementary
information at low and high $q^2$. Suitable combinations of these observables have also been
identified that either i) have reduced hadronic uncertainties and possibly higher sensitivities
to BSM contributions [51-61]; or ii) become short-distance independent, allowing one
to gain information on form factors [57]. The decay $B \to K\bar\ell\ell$ offers fewer observables,
some of which are sensitive to scalar and pseudoscalar [62, 63] interactions; and in the
high-$q^2$ region the same short-distance dependence as in $B \to K^*\bar\ell\ell$ can be tested [64].

We perform a model-independent fit of the short-distance couplings $C_{7,9,10}$ to the
experimental data for exclusive $B \to K^*\gamma$, $B \to K^{(*)}\bar\ell\ell$, and $B_s \to \bar\mu\mu$ decays considering
the standard set of operators described in more detail in Sec. 2. Improving on our previous
analyses [57, 59, 64], we include the latest experimental data and add $B \to K^*\gamma$ and
$B_s \to \bar\mu\mu$, as collected in Sec. 2. We go beyond [65] by including high-$q^2$ data for $B \to K^*\bar\ell\ell$
and the measurements of $B \to K\bar\ell\ell$; however, we do not consider inclusive measurements.
In comparison to the very recent analysis [66], we use updated data and include $B \to K\bar\ell\ell$,
but again do not consider the inclusive decays. For our analysis and for all numerical
evaluations we use EOS [67].

Our analysis differs from all previous works [57, 59, 64-66] in its application of Bayesian 
inference with the help of Monte-Carlo techniques to treat theory uncertainties in the form 
of nuisance parameters. The statistical treatment and the choice of priors, as well as the 
determination of credibility intervals, goodness of fit, pull values, and Bayes factors for 
model comparison are described in Sec. 3. In Sec. 4, we present the results of the fit for 
short-distance couplings and discuss those nuisance parameters that are affected by data. 
We present updated predictions based on the fit results for unmeasured, optimized observables
in the angular analysis of $B \to K^*(\to K\pi)\bar\ell\ell$ in the given scenario. The numerical
input and details of the implementation of observables and nuisance parameters are summarized
in App. A and App. B, respectively. We also present updated SM predictions in
App. C. App. D contains definitions of distributions that have been used to model priors.

2 $\Delta B = 1$ Decays: Conventions, Observables and Experimental Input

Rare $\Delta B = 1$ decays are described by the effective theory of electroweak interactions. In the
SM, the short-distance effects of heavy degrees of freedom of the order of the electroweak
scale, due to the $W$ and $Z$ bosons and the $t$ quark, are contained in the Wilson coefficients
$C_i$. The dynamics of the light-quark ($q = u, d, s, c, b$) and leptonic ($\ell = e, \mu, \tau$) degrees of
freedom at the scale of the $b$ quark are described by operators $O_i$ of dimension 5 and
6 for the parton transitions $b \to s + (\gamma, g, \bar q q, \bar\ell\ell)$. The SM Wilson coefficients $C_i$ ($i =
1, \ldots, 10$) are presently known up to NNLO (and partially NNNLO) in QCD [28, 68-72]
and NLO in QED [31, 32, 73, 74]. This includes the renormalization group evolution (RGE)
from the electroweak scale $\mu_W \sim M_W$ down to $\mu_b \sim m_b$, the $b$-quark mass, which resums
sizable logarithmic corrections to all orders in the QCD coupling $\alpha_s$. Beyond the SM, the
effects due to new heavy degrees of freedom can be included systematically as additional
contributions to the short-distance couplings, possibly giving rise to operators beyond the
SM with a different chiral nature or additional light degrees of freedom.

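2.1 $\Delta B = 1$ Effective Theory

The effective Hamiltonian of $\Delta B = 1$ decays reads [28, 68]

$$\mathcal{H}_{\rm eff} = -\frac{4 G_F}{\sqrt{2}} V_{tb} V_{ts}^* \left( \mathcal{H}_{\rm eff}^{(t)} + \hat\lambda_u \mathcal{H}_{\rm eff}^{(u)} \right), \qquad \hat\lambda_u = \frac{V_{ub} V_{us}^*}{V_{tb} V_{ts}^*}, \tag{2.1}$$

$$\mathcal{H}_{\rm eff}^{(t)} = C_1 O_1^c + C_2 O_2^c + \sum_{i \geq 3} C_i O_i, \qquad \mathcal{H}_{\rm eff}^{(u)} = C_1 (O_1^c - O_1^u) + C_2 (O_2^c - O_2^u), \tag{2.2}$$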

where $V_{ij}$ denotes an element of the Cabibbo-Kobayashi-Maskawa (CKM) quark-mixing
matrix, and its unitarity relations have been used. Above and throughout, the Wilson
coefficients are understood to be $\overline{\mathrm{MS}}$ renormalized and taken at the reference scale $\mu =
4.2$ GeV.¹ In the SM, all CP-violating effects in $b \to s$ transitions are governed by $\hat\lambda_u$,
which is doubly Cabibbo suppressed and leads to tiny CP violation. The operators due
to $b \to s\bar q q$ transitions are the current-current operators $O_{1,2}^{u,c}$, the QCD-penguin operators
for $i = 3, 4, 5, 6$, and the chromomagnetic dipole operator $i = 8$. Effects of QED-penguin
operators are neglected since they are small for the decays under consideration. Following
the studies of QED corrections to the inclusive decay, we choose the QED coupling $\alpha_e$
at the low scale $\mu_b$, capturing most effects of QED corrections [31, 32] and removing the
main uncertainty due to the choice of the renormalization scheme at LO in QED. The
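electromagnetic dipole operator

$$O_7 = \frac{e}{(4\pi)^2}\, m_b \left[\bar s \sigma_{\mu\nu} P_R b\right] F^{\mu\nu} \tag{2.3}$$

governs $b \to s\gamma$ transitions. The semileptonic operators

$$O_9 = \frac{\alpha_e}{4\pi} \left[\bar s \gamma_\mu P_L b\right]\left[\bar\ell \gamma^\mu \ell\right], \qquad O_{10} = \frac{\alpha_e}{4\pi} \left[\bar s \gamma_\mu P_L b\right]\left[\bar\ell \gamma^\mu \gamma_5 \ell\right] \tag{2.4}$$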

¹ Note that the actual low-energy renormalization scale might differ from $\mu$, and the corresponding
RGE effect $C_i(\mu_b) = U(\mu_b, \mu)_{ij} C_j(\mu)$ should be taken into account in renormalization-scale variations when
determining the related uncertainty. Throughout we use a central value $\mu_b = \mu$.

observable                  value                        correlation   ref.
$\mathcal{B} \times 10^5$   4.55 ^{+…}_{-…} ± 0.34                      [1]
                            4.47 ± 0.10 ± 0.16                          [7]
                            4.01 ± 0.21 ± 0.17                          [10]
$S$                         -0.03 ± 0.29 ± 0.03          5%            [5]
$C$                         -0.14 ± 0.16 ± 0.03
$S$                         -0.32 ^{+…}_{-…} ± 0.05      8%            [12]
$C$                         +0.20 ± 0.24 ± 0.05

Table 1. Experimental results for CP-averaged $B^0 \to K^{*0}\gamma$ observables: branching fraction $\mathcal{B}$
(CLEO, BaBar, Belle) and time-dependent CP asymmetries $S$ and $C$ (BaBar, Belle), including
their correlations. Throughout, statistical errors are given first, followed by the systematic errors.

govern $b \to s\bar\ell\ell$ transitions, in combination with less important contributions from $O_7$.

In this study, we fit the Wilson coefficients $C_{7,9,10}$ at the reference scale $\mu = 4.2$ GeV
using experimental data. We assume them to be real valued and refrain from the frequently
used decomposition into SM and BSM contributions, $C_i = C_i^{\rm SM} + C_i^{\rm BSM}$. The Wilson coefficients
$C_i$ with $i \leq 6$ and $i = 8$ contribute numerically only at the subleading level in the observables
of interest and are fixed to the corresponding SM values at NNLO in QCD.

Whereas our scenario corresponds to the SM or extensions that introduce neither new
CP violation nor new operators, more general scenarios have been investigated in the
literature. The extension of this scenario with complex Wilson coefficients (i.e., CP
violation beyond the SM) but no additional operators was studied in [59, 64, 66]. An
extended operator basis with real Wilson coefficients, including chirality-flipped operators
$i = 7', 9', 10'$, has been analyzed in [65, 66]. Finally, the combination of both can be found in
[66]. Beyond these scenarios, it is conceivable that scalar, pseudoscalar, and tensor $b \to s\bar\ell\ell$
($\ell = e, \mu$) operators can also contribute to the observables under consideration [54, 58, 63].
Beyond such direct contributions, additional ones can arise due to operator mixing from
$b \to s\bar q q$ operators [75, 76] as well as $b \to s\bar\tau\tau$ [77].

2.2 Observables and Experimental Input 

Phenomenological studies have analyzed and proposed a large number of CP-symmetric
and CP-asymmetric observables. We summarize observables that either have been measured
and therefore impose constraints on the Wilson coefficients, or that i) are
sensitive to the operators of interest and ii) exhibit a reduced hadronic uncertainty. For
the latter, we compute the ranges that are still allowed by the data within the chosen
scenario. Throughout, experimental numbers refer to CP-averaged quantities.

2.2.1 $B \to K^*\gamma$

For $B \to K^*\gamma$, several observables have been measured, such as the branching ratio $\mathcal{B}$,
the time-dependent CP asymmetries $S$ and $C$, and the isospin asymmetry $A_I$. Their
impact on the scenario of real $C_{7,7'}$ has been studied in [65] using the inclusive $\mathcal{B}$ instead
of the exclusive one. The measurement of $B_s \to \phi\gamma$ can provide similar information and
allows a third CP asymmetry $H$ to be studied [78]. The angular distribution in the decay
$B \to K_1(1270)\gamma \to (K\pi\pi)\gamma$ is sensitive to the photon polarization and tests $C_{7,7'}$; however,
the feasibility of an analysis remains uncertain [79, 80]. In our analysis we use $\mathcal{B}$ and the
CP asymmetries $S$ and $C$ of $B \to K^*\gamma$ with their measurements and correlations compiled
in Tab. 1, and follow the calculations outlined in [40, 81]. More details on the numerical
input and nuisance parameters can be found in App. A and App. B.

$q^2$ bin [GeV$^2$]                      [1.00, 6.00]              [14.18, 16.00]            [> 16.00]                 ref.
$\langle\mathcal{B}\rangle \times 10^7$  2.05 ^{+…}_{-…} ± 0.07    1.46 ^{+…}_{-…} ± 0.06    1.02 ^{+…}_{-…} ± 0.06    [8]
                                         1.36 ^{+…}_{-…} ± 0.08    0.38 ^{+…}_{-…} ± 0.02    0.98 ^{+…}_{-…} ± 0.06    [13]
                                         1.41 ± 0.20 ± 0.09        0.53 ± 0.10 ± 0.03        0.48 ± 0.11 ± 0.03        [15]

Table 2. Experimental results for the CP-averaged branching fraction of $B^+ \to K^+\bar\ell\ell$
decays from BaBar [8], Belle [13], and CDF [15], integrated in bins of $q^2$. The publicly available
results of BaBar and Belle are unknown admixtures of charged and neutral $B$ decays. The difference
between interpreting the data as coming from either purely charged or purely neutral $B$ decays is
negligible [64].

2.2.2 $B \to K\bar\ell\ell$

In principle, the exclusive decay $B \to K\bar\ell\ell$ with a 3-body final state offers three (CP-averaged)
observables: the branching ratio $\mathcal{B}(q^2)$, the lepton forward-backward asymmetry
$A_{\rm FB}(q^2)$, and the flat term $F_H(q^2)$. The latter two arise in the double-differential decay
rate when differentiating with respect to the dilepton invariant mass and $\cos\theta_\ell$ [63]

$$\frac{1}{d\Gamma/dq^2} \frac{d^2\Gamma}{dq^2\, d\cos\theta_\ell} = \frac{3}{4}\,(1 - F_H)\sin^2\theta_\ell + \frac{1}{2} F_H + A_{\rm FB}\cos\theta_\ell, \tag{2.5}$$

where $\theta_\ell$ is the angle between the 3-momenta of the negatively charged lepton and the $B$
meson in the dilepton center of mass system. Two further interesting observables are the
rate CP asymmetry $A_{\rm CP}$ and the ratio of decay rates for the $\ell = e$ and $\ell = \mu$ modes, $R_K$. $A_{\rm FB}$
is nonzero only in the presence of scalar or tensor BSM contributions, and $F_H$ is helicity
suppressed by $m_\ell/\sqrt{q^2}$ in the scenario under consideration, but is sensitive to scalar and
tensor contributions [62, 63]. In view of this, available measurements of $A_{\rm FB}$, $F_H$, and $R_K$
are not considered, and we include only the $\mathcal{B}$ measurements for one low-$q^2$ and two high-$q^2$
bins as listed in Tab. 2. Our theory evaluation at low and high $q^2$ follows [63, 64]. Details
concerning numerical input and nuisance parameters are given in App. A and App. B.
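
The role of $F_H$ and $A_{\rm FB}$ in the angular distribution can be checked numerically. The following minimal sketch assumes the distribution has the form $\frac{3}{4}(1 - F_H)\sin^2\theta_\ell + \frac{1}{2}F_H + A_{\rm FB}\cos\theta_\ell$; the function names and the input values are illustrative assumptions, not measurements or part of the analysis code. It verifies that the distribution is normalized to one and that the forward-backward difference reproduces $A_{\rm FB}$.

```python
def angular_pdf(cos_theta, f_h, a_fb):
    """Normalized B -> K l l distribution in cos(theta_l), assumed form of Eq. (2.5)."""
    sin2 = 1.0 - cos_theta ** 2
    return 0.75 * (1.0 - f_h) * sin2 + 0.5 * f_h + a_fb * cos_theta


def integrate(func, lo, hi, n=2000):
    """Midpoint-rule integration of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


# Illustrative values only (hypothetical, not taken from any measurement).
F_H, A_FB = 0.05, 0.02

norm = integrate(lambda c: angular_pdf(c, F_H, A_FB), -1.0, 1.0)
forward = integrate(lambda c: angular_pdf(c, F_H, A_FB), 0.0, 1.0)
backward = integrate(lambda c: angular_pdf(c, F_H, A_FB), -1.0, 0.0)

print("normalization:", norm)
print("forward - backward:", forward - backward)
```

With these conventions the flat term only changes the shape of the even part of the distribution, whereas the odd $\cos\theta_\ell$ term alone carries the asymmetry.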

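2.2.3 $B \to K^*(\to K\pi)\bar\ell\ell$

Phenomenologically, the angular analysis of the 4-body final state $B \to K^*(\to K\pi)\bar\ell\ell$ offers
a large set of "angular" observables

$$\langle J_i \rangle [q^2_{\rm min}, q^2_{\rm max}] = \int_{q^2_{\rm min}}^{q^2_{\rm max}} dq^2\, J_i(q^2), \qquad i = 1, \ldots, 9, \tag{2.6}$$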
$q^2$ bin [GeV$^2$]                      [1.00, 6.00]                          [14.18, 16.00]                        [> 16.00]                             ref.
$\langle\mathcal{B}\rangle \times 10^7$  2.05 ^{+…}_{-…} ± 0.07                1.46 ^{+…}_{-…} ± 0.06                1.02 ^{+…}_{-…} ± 0.06                [8]
                                         …                                     …                                     …                                     [13]
                                         1.42 ± 0.41 ± 0.08                    1.34 ± 0.26 ± 0.08                    0.97 ± 0.26 ± 0.06                    [15]
                                         …                                     …                                     …                                     [18]
$\langle A_{\rm FB}\rangle$              -0.02 ^{+…}_{-…} ± 0.07               -0.31 ^{+…}_{-…} ± 0.13               -0.34 ^{+…}_{-…} ± 0.08               [9]
                                         …                                     -0.70 ^{+…}_{-…} ± 0.10               …                                     [13]
                                         -0.36 ^{+…}_{-…} ± 0.11               -0.40 ^{+…}_{-…} ± 0.07               -0.66 ^{+…}_{-…} ± 0.19               [16]
                                         …                                     … ^{+0.06}_{-0.04} ^{+0.05}_{-0.02}   …                                     [18]
$\langle F_L\rangle$                     …                                     …                                     0.47 ^{+…}_{-…} ± 0.13                [9]
                                         0.67 ± 0.23 ± 0.05                    -0.15 ^{+…}_{-…} ± 0.07               0.12 ^{+…}_{-…} ± 0.02                [13]
                                         0.60 ^{+…}_{-…} ± 0.09                0.32 ± 0.14 ± 0.03                    0.16 ^{+…}_{-…} ± 0.06                [16]
                                         0.66 ± 0.06 ^{+…}_{-…}                … ^{+0.07}_{-0.06} ^{+0.07}_{-0.02}   … ^{+0.06}_{-0.07} ^{+0.03}_{-0.04}   [18]
$\langle A_T^{(2)}\rangle$               1.6 ^{+…}_{-…} ± 2.2                  0.4 ± 0.8 ± 0.2                       -0.9 ± 0.8 ± 0.4                      [16]
$\langle 2S_3\rangle$                    … ^{+0.15}_{-…} ^{+0.02}_{-0.01}      … ^{+0.15}_{-0.19} ^{+0.04}_{-0.02}   … ^{+0.21}_{-0.10} ^{+0.03}_{-0.05}   [18]

Table 3. Experimental results used for $B^0 \to K^{*0}\bar\ell\ell$ for the CP-averaged branching fraction $\mathcal{B}$,
lepton forward-backward asymmetry $A_{\rm FB}$, longitudinal $K^*$-polarization fraction $F_L$, the transversity
observable $A_T^{(2)}$ and $\langle 2S_3\rangle$ from BaBar [8, 9], Belle [13], CDF [15, 16], and LHCb [18]. Note
that the sign of $A_{\rm FB}$ is reversed due to a different definition of $\theta_\ell$ in the experimental community.



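where the boundaries of the $q^2$ bin (throughout in units of GeV$^2$) will not be explicitly
shown when they are not relevant. Throughout, we assume that the experimental measurements
are given for a certain $q^2$ binning that requires $q^2$ integration for theory predictions.
Consequently, whenever a $q^2$-dependent observable $X(q^2)$ is defined in a functional form
$X(q^2) = f[J_i](q^2)$ in terms of the angular observables, we define the corresponding $q^2$-integrated
quantity as follows [57]

$$\langle X \rangle = f[\langle J_i \rangle]. \tag{2.7}$$

The angular observables $\langle J_i \rangle$ are defined in the 3-fold angular distribution

$$\begin{aligned}
\frac{32\pi}{9} \frac{d^3\langle\Gamma\rangle}{d\cos\theta_\ell\, d\cos\theta_K\, d\phi} ={}& \left[\langle J_{1s}\rangle + \langle J_{2s}\rangle \cos 2\theta_\ell + \langle J_{6s}\rangle \cos\theta_\ell\right] \sin^2\theta_K + \left[\langle J_{1c}\rangle + \langle J_{2c}\rangle \cos 2\theta_\ell + \langle J_{6c}\rangle \cos\theta_\ell\right] \cos^2\theta_K \\
&+ \langle J_3\rangle \sin^2\theta_K \sin^2\theta_\ell \cos 2\phi + \langle J_4\rangle \sin 2\theta_K \sin 2\theta_\ell \cos\phi + \langle J_5\rangle \sin 2\theta_K \sin\theta_\ell \cos\phi \\
&+ \langle J_7\rangle \sin 2\theta_K \sin\theta_\ell \sin\phi + \langle J_8\rangle \sin 2\theta_K \sin 2\theta_\ell \sin\phi + \langle J_9\rangle \sin^2\theta_K \sin^2\theta_\ell \sin 2\phi,
\end{aligned} \tag{2.8}$$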

which accounts for all possible $(\bar s \ldots b)(\bar\ell \ldots \ell)$ Lorentz structures of chirality-flipped, scalar,
pseudoscalar, and tensor operators [54, 58]. The angles are: i) $\theta_\ell$ between the … and the
$K^*$ direction of flight in the $(\bar\ell\ell)$ center of mass, ii) $\theta_K$ between the $K$ and the $K^*$ in the
$(K\pi)$ center of mass and iii) $\phi$ between the $(\bar\ell\ell)$ and $(K\pi)$ decay planes [51]. Here the
normalization of the $J_i$ from [51, 53, 54] is used and differs by a factor 4/3 from [52, 57, 59].
The following simplifications arise in the limit $m_\ell \to 0$ and in the absence of scalar and
tensor operators [54, 58]:

$$J_{1s} = 3 J_{2s}, \qquad J_{1c} = -J_{2c}, \qquad J_{6c} = 0, \tag{2.9}$$

and a fourth more complicated relation [56]. It is straightforward to obtain the decay rate
and the three single-differential angular distributions from (2.8)

$$\langle\Gamma\rangle = \frac{3}{4}\left[2\langle J_{1s}\rangle + \langle J_{1c}\rangle\right] - \frac{1}{4}\left[2\langle J_{2s}\rangle + \langle J_{2c}\rangle\right], \tag{2.10}$$



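$$\frac{d\langle\Gamma\rangle}{d\phi} = \frac{1}{2\pi}\left[\langle\Gamma\rangle + \langle J_3\rangle \cos 2\phi + \langle J_9\rangle \sin 2\phi\right], \tag{2.11}$$

$$\frac{d\langle\Gamma\rangle}{d\cos\theta_K} = \frac{3}{8}\left[\left(3\langle J_{1s}\rangle - \langle J_{2s}\rangle\right)\sin^2\theta_K + \left(3\langle J_{1c}\rangle - \langle J_{2c}\rangle\right)\cos^2\theta_K\right], \tag{2.12}$$

$$\frac{d\langle\Gamma\rangle}{d\cos\theta_\ell} = \frac{3}{8}\left[2\langle J_{1s}\rangle + \langle J_{1c}\rangle + \left(2\langle J_{6s}\rangle + \langle J_{6c}\rangle\right)\cos\theta_\ell + \left(2\langle J_{2s}\rangle + \langle J_{2c}\rangle\right)\cos 2\theta_\ell\right]. \tag{2.13}$$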
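
The step from the 3-fold distribution to the total rate can be cross-checked by brute force. The sketch below is an illustration under stated assumptions: it takes the angular distribution in the form of (2.8) with the overall factor $9/(32\pi)$, uses made-up values for the $J_i$ (they are not fit results), and compares a midpoint-rule integration over $\cos\theta_\ell$, $\cos\theta_K$ and $\phi$ with the closed form $\frac{3}{4}[2J_{1s} + J_{1c}] - \frac{1}{4}[2J_{2s} + J_{2c}]$. All function names are hypothetical.

```python
import math


def d3_gamma(cos_l, cos_k, phi, J):
    """Assumed form of Eq. (2.8), solved for the triple-differential rate."""
    sin_l = math.sqrt(1.0 - cos_l * cos_l)   # theta in [0, pi], so sin >= 0
    sin_k = math.sqrt(1.0 - cos_k * cos_k)
    cos_2l = 2.0 * cos_l * cos_l - 1.0
    sin_2l = 2.0 * sin_l * cos_l
    sin_2k = 2.0 * sin_k * cos_k
    rhs = ((J["1s"] + J["2s"] * cos_2l + J["6s"] * cos_l) * sin_k ** 2
           + (J["1c"] + J["2c"] * cos_2l + J["6c"] * cos_l) * cos_k ** 2
           + J["3"] * sin_k ** 2 * sin_l ** 2 * math.cos(2.0 * phi)
           + J["4"] * sin_2k * sin_2l * math.cos(phi)
           + J["5"] * sin_2k * sin_l * math.cos(phi)
           + J["7"] * sin_2k * sin_l * math.sin(phi)
           + J["8"] * sin_2k * sin_2l * math.sin(phi)
           + J["9"] * sin_k ** 2 * sin_l ** 2 * math.sin(2.0 * phi))
    return 9.0 / (32.0 * math.pi) * rhs


def gamma_from_J(J):
    """Closed-form rate, Eq. (2.10)."""
    return 0.75 * (2.0 * J["1s"] + J["1c"]) - 0.25 * (2.0 * J["2s"] + J["2c"])


def integrate_rate(J, n=40):
    """Midpoint rule on an n x n x n grid in (cos theta_l, cos theta_K, phi)."""
    h_c, h_p = 2.0 / n, 2.0 * math.pi / n
    total = 0.0
    for a in range(n):
        cos_l = -1.0 + (a + 0.5) * h_c
        for b in range(n):
            cos_k = -1.0 + (b + 0.5) * h_c
            for c in range(n):
                total += d3_gamma(cos_l, cos_k, (c + 0.5) * h_p, J)
    return total * h_c * h_c * h_p


# Made-up J values, for illustration only.
J_EXAMPLE = {"1s": 0.3, "1c": 0.4, "2s": 0.1, "2c": -0.35, "3": 0.02, "4": 0.1,
             "5": -0.15, "6s": 0.2, "6c": 0.0, "7": 0.01, "8": 0.01, "9": 0.005}

gamma_numeric = integrate_rate(J_EXAMPLE)
gamma_closed = gamma_from_J(J_EXAMPLE)
print(gamma_numeric, gamma_closed)
```

Only $J_{1s}$, $J_{1c}$, $J_{2s}$ and $J_{2c}$ survive the full angular integration; all other terms integrate to zero, which is why they do not appear in the rate.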



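The branching ratio $\langle\mathcal{B}\rangle$, the lepton forward-backward asymmetry $\langle A_{\rm FB}\rangle$, and the
longitudinal $K^*$-polarization fraction $\langle F_L\rangle$

$$\langle\mathcal{B}\rangle = \tau_{B^0} \langle\Gamma\rangle, \qquad \langle A_{\rm FB}\rangle = \frac{3}{8}\, \frac{2\langle J_{6s}\rangle + \langle J_{6c}\rangle}{\langle\Gamma\rangle}, \qquad \langle F_L\rangle = \frac{3\langle J_{1c}\rangle - \langle J_{2c}\rangle}{4\langle\Gamma\rangle}, \tag{2.14}$$

have been measured by BaBar [8, 9], Belle [13], CDF [15, 16], and LHCb [18]. The angular
observable $\langle A_T^{(2)}\rangle$ [51] has been measured by CDF [16]; and $\langle S_3\rangle$ [54] has been determined
by LHCb [18]:

$$\langle A_T^{(2)}\rangle = \frac{\langle J_3\rangle}{2\langle J_{2s}\rangle}, \qquad \langle S_3\rangle = \frac{\langle J_3\rangle}{\langle\Gamma\rangle}. \tag{2.15}$$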



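All are summarized in Tab. 3. Note that $\langle A_{\rm FB}\rangle$ and $\langle F_L\rangle$ are determined from a combined
fit to the single-differential angular distributions

$$\frac{1}{\langle\Gamma\rangle} \frac{d\langle\Gamma\rangle}{d\cos\theta_K} = \frac{3}{2}\langle F_L\rangle \cos^2\theta_K + \frac{3}{4}\left[1 - \langle F_L\rangle\right] \sin^2\theta_K, \tag{2.16}$$

$$\frac{1}{\langle\Gamma\rangle} \frac{d\langle\Gamma\rangle}{d\cos\theta_\ell} = \frac{3}{4}\langle F_L\rangle \left(1 - \cos^2\theta_\ell\right) + \frac{3}{8}\left[1 - \langle F_L\rangle\right] \left(1 + \cos^2\theta_\ell\right) + \langle A_{\rm FB}\rangle \cos\theta_\ell. \tag{2.17}$$

The observables $\langle A_T^{(2)}\rangle$ and $\langle A_{\rm im}\rangle = \langle J_9\rangle / \langle\Gamma\rangle$ are determined from

$$\frac{2\pi}{\langle\Gamma\rangle} \frac{d\langle\Gamma\rangle}{d\phi} = 1 + \frac{1}{2}\left[1 - \langle F_L\rangle\right] \langle A_T^{(2)}\rangle \cos 2\phi + \langle A_{\rm im}\rangle \sin 2\phi, \tag{2.18}$$

implying $\langle S_3\rangle = (1 - \langle F_L\rangle)\, \langle A_T^{(2)}\rangle / 2$. Note that (2.17) and (2.18) are based on the approximation
(2.9), which is well justified within our scenario.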
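
To make the binning prescription and the relation between $\langle S_3\rangle$ and $\langle A_T^{(2)}\rangle$ concrete, the following sketch computes normalized observables from $q^2$-integrated $\langle J_i\rangle$. It assumes $\langle A_{\rm FB}\rangle = \frac{3}{8}(2\langle J_{6s}\rangle + \langle J_{6c}\rangle)/\langle\Gamma\rangle$ and $\langle F_L\rangle = (3\langle J_{1c}\rangle - \langle J_{2c}\rangle)/(4\langle\Gamma\rangle)$, which follow from the single-differential distributions, and uses invented $J_i$ values that satisfy (2.9). The names `observables` and `add_bins` are hypothetical helpers, not part of any existing code base.

```python
def gamma_from_J(J):
    """Integrated rate from Eq. (2.10)."""
    return 0.75 * (2.0 * J["1s"] + J["1c"]) - 0.25 * (2.0 * J["2s"] + J["2c"])


def observables(J):
    """Normalized observables built from q^2-integrated J_i (assumed definitions)."""
    gamma = gamma_from_J(J)
    return {
        "A_FB": 0.375 * (2.0 * J["6s"] + J["6c"]) / gamma,
        "F_L": (3.0 * J["1c"] - J["2c"]) / (4.0 * gamma),
        "S_3": J["3"] / gamma,
        "A_T2": J["3"] / (2.0 * J["2s"]),
    }


def add_bins(*bins):
    """Combine adjacent q^2 bins: the integrated J_i simply add."""
    return {key: sum(b[key] for b in bins) for key in bins[0]}


# Invented inputs obeying J_1s = 3 J_2s, J_1c = -J_2c, J_6c = 0.
J_A = {"1s": 0.3, "1c": 0.35, "2s": 0.1, "2c": -0.35, "3": 0.02, "6s": 0.2, "6c": 0.0}
J_B = {"1s": 0.6, "1c": 0.20, "2s": 0.2, "2c": -0.20, "3": -0.04, "6s": 0.1, "6c": 0.0}

obs_a = observables(J_A)
obs_b = observables(J_B)
obs_ab = observables(add_bins(J_A, J_B))

# Ratio of integrals (the prescription used here) versus naive average of ratios.
print(obs_ab["F_L"], 0.5 * (obs_a["F_L"] + obs_b["F_L"]))
```

The combined-bin value of $\langle F_L\rangle$ differs from the plain average of the two bins because each bin enters weighted by its rate, which is the point of defining $\langle X\rangle$ through the integrated $\langle J_i\rangle$.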

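The angular observables $\langle J_i\rangle$ and the branching ratio $\langle\mathcal{B}\rangle$ are proportional to the
square of hadronic form factors, the main source of theory uncertainty. In normalized
combinations of the angular observables, for example $A_{\rm FB}$ and $F_L$, these uncertainties
partially cancel. The most prominent example is the position $q_0^2[A_{\rm FB}]$ of the zero crossing
of $A_{\rm FB}$. A number of suitable combinations have been found for both low- and high-$q^2$
regions. At low $q^2$ [51, 53, 56, 60, 61]

$$\langle A_T^{(2)}\rangle = \frac{\langle J_3\rangle}{2\langle J_{2s}\rangle}, \qquad \langle A_T^{(\rm re)}\rangle = \frac{\langle J_{6s}\rangle}{4\langle J_{2s}\rangle}, \qquad \langle A_T^{(\rm im)}\rangle = \frac{\langle J_9\rangle}{2\langle J_{2s}\rangle}, \tag{2.19}$$

$$\langle A_T^{(3)}\rangle = \sqrt{\frac{(2\langle J_4\rangle)^2 + \langle J_7\rangle^2}{-2\langle J_{2c}\rangle\,(2\langle J_{2s}\rangle + \langle J_3\rangle)}}, \qquad \langle A_T^{(4)}\rangle = \sqrt{\frac{\langle J_5\rangle^2 + (2\langle J_8\rangle)^2}{(2\langle J_4\rangle)^2 + \langle J_7\rangle^2}}, \tag{2.20}$$

$$\langle A_T^{(5)}\rangle = \frac{\sqrt{(4\langle J_{2s}\rangle)^2 - \langle J_{6s}\rangle^2 - 4\left(\langle J_3\rangle^2 + \langle J_9\rangle^2\right)}}{8\langle J_{2s}\rangle}, \tag{2.21}$$

whereas at high $q^2$ [57]

$$\langle H_T^{(1)}\rangle = \frac{\sqrt{2}\,\langle J_4\rangle}{\sqrt{-\langle J_{2c}\rangle\,(2\langle J_{2s}\rangle - \langle J_3\rangle)}}, \tag{2.22}$$

$$\langle H_T^{(2)}\rangle = \frac{\langle J_5\rangle}{\sqrt{-2\langle J_{2c}\rangle\,(2\langle J_{2s}\rangle + \langle J_3\rangle)}}, \qquad \langle H_T^{(3)}\rangle = \frac{\langle J_{6s}\rangle}{2\sqrt{(2\langle J_{2s}\rangle)^2 - \langle J_3\rangle^2}}. \tag{2.23}$$

For brevity, factors of $\beta_\ell = \sqrt{1 - 4 m_\ell^2 / q^2}$ have been set to unity, since they are negligible
in our scenario for the considered range $q^2 \geq 1$ GeV$^2$. Recently, it was found that … and
$H_T^{(…)}$ are also optimized observables at low $q^2$ [61].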

We note that at low $q^2$, $J_3$ and $J_9$ vanish at leading order in QCDF [52], making them
ideal probes of chirality-flipped operators $i = 7', 9', 10'$ because leading terms in QCDF are
$\sim \mathrm{Re}[C_i C_{j'}^*]$ and $\sim \mathrm{Im}[C_i C_{j'}^*]$. $J_9$ (and also $J_{7,8}$) vanishes for real Wilson coefficients, and
therefore the measurements of $\langle A_T^{(\rm im)}\rangle$ and $\langle A_{\rm im}\rangle$ are not of interest for our scenario. Only
partial results of the subleading corrections exist [40, 81] and only those of kinematic origin
are included in the numerical evaluation. Nevertheless, $\langle A_T^{(2)}\rangle$ and $\langle 2S_3\rangle$ are included in
our fit because they might allow us to obtain information on the nuisance parameters used
to model yet-unknown subleading contributions (see App. B.3).

At high $q^2$, $F_L$ and $A_T^{(2)}$ become short-distance independent [57] and the experimental
data allow us to constrain the form-factor-related nuisance parameters; see App. B.2. This
has been exploited recently [82] to extract the $q^2$ dependence of form factors from data
and, comparing with preliminary lattice results, to find overall agreement within the still
large uncertainties.

In our predictions, we therefore focus on the yet-unmeasured optimized observables
$\langle A_T^{(…)}\rangle$ and $\langle H_T^{(…)}\rangle$.

2.2.4 $B_s \to \bar\mu\mu$

The rare decay $B_s \to \bar\mu\mu$ is helicity suppressed in the SM, making it an ideal probe of
contributions from scalar and pseudoscalar operators. Its branching ratio depends only on
$C_{10}$ in the scenario under consideration,

$$\mathcal{B}[B_s(t = 0) \to \bar\mu\mu] = \frac{G_F^2\, \alpha_e^2\, M_{B_s}^3 f_{B_s}^2\, \tau_{B_s}}{64\pi^3}\, |V_{tb} V_{ts}^*|^2 \sqrt{1 - \frac{4 m_\mu^2}{M_{B_s}^2}}\; \frac{4 m_\mu^2}{M_{B_s}^2}\, |C_{10}|^2, \tag{2.24}$$

and is predicted in the SM to be around $3 \times 10^{-9}$. The main uncertainties are due to the
decay constant $f_{B_s}$ and the CKM factor $|V_{tb} V_{ts}^*|$.

Above, the mixing of $B_s$ mesons has not been taken into account, i.e., the branching
ratio refers to time $t = 0$. However, experimentally the time-integrated branching ratio is
determined. Both are related in our SM-like scenario as [83]

$$\mathcal{B}[B_s \to \bar\mu\mu] = \frac{1}{1 - y_s}\, \mathcal{B}[B_s(t = 0) \to \bar\mu\mu], \qquad y_s = \frac{\Delta\Gamma_s}{2\Gamma_s}. \tag{2.25}$$

Lately, the most precise measurement of the lifetime difference $\Delta\Gamma_s$ became available
from LHCb [84]; moreover, LHCb succeeded in determining the sign of $\Delta\Gamma_s$ [85], which
turned out to be SM-like. In view of this, we will use the numerical value from LHCb,
$y_s = 0.088 \pm 0.014$ [84].
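
As a quick numerical illustration of the size of this correction, assume the relation between the time-integrated and the $t = 0$ branching ratio has the form $\mathcal{B} = \mathcal{B}(t=0)/(1 - y_s)$ and propagate the uncertainty of $y_s$ linearly (the linear propagation is our simplification for this estimate):

```latex
\frac{1}{1 - y_s} = \frac{1}{1 - 0.088} \approx 1.096,
\qquad
\delta\!\left(\frac{1}{1 - y_s}\right) \approx \frac{\delta y_s}{(1 - y_s)^2}
  = \frac{0.014}{0.912^2} \approx 0.017 .
```

The time-integrated branching ratio is therefore roughly 10% larger than the $t = 0$ value, with the uncertainty of $y_s$ contributing at the level of about 1.5%.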

In the last decade, the Tevatron experiments D0 [19] and CDF [20, 21] lowered the
upper bound on the branching ratio by several orders of magnitude to a value close to
$1 \times 10^{-8}$; and CDF announced the first direct evidence based on a $2\sigma$ fluctuation over the
background-only hypothesis [20, 21]. This year the LHC experiments LHCb, CMS, and
ATLAS provided their results based on the complete 2011 run [22-26]. In our analysis
we use the most stringent result $\mathcal{B}(B_s \to \bar\mu\mu) < 4.5 \times 10^{-9}$ ($3.8 \times 10^{-9}$) at 95% (90%) CL,
obtained by LHCb [23]. Details of the implementation of this bound are given in Sec. 3.3.

3 Statistical Method 

We have decided to use the full Bayesian approach in this analysis. It allows us to
incorporate all available experimental results, to obtain probability statements about the
parameters of interest $\theta$ (the Wilson coefficients) and to compare different models
using the Bayes factor.

In the Bayesian approach, we describe theory uncertainties by adding nuisance parameters
$\nu$. It is straightforward to incorporate existing knowledge (say from power
counting, symmetry arguments, or even other dedicated Bayesian analyses) about these
theory uncertainties by specifying informative priors. As a cross validation, it is useful to
employ different priors and compare the posterior inference. Any significant discrepancy
based on two different prior choices implies that more accurate experimental or theoretical
input is needed before conclusive statements can be made. This can be seen as a feature
of the Bayesian methodology. Throughout, we assume that parameters are independent a
priori,

$$P(\theta, \nu) = \prod_i P(\theta_i) \cdot \prod_j P(\nu_j). \tag{3.1}$$

The experimental data $D$ are used in the likelihood $P(D \mid \theta, \nu)$, and Bayes' theorem yields
the posterior knowledge about the parameters after learning from the data $D$

$$P(\theta, \nu \mid D) = \frac{P(D \mid \theta, \nu)\, P(\theta, \nu)}{Z}, \tag{3.2}$$

with the normalization given by the evidence

$$Z = \int d\theta\, d\nu\, P(D \mid \theta, \nu)\, P(\theta, \nu). \tag{3.3}$$

In case we want to remove the dependence on $\nu$ in the posterior, we simply marginalize:

$$P(\theta \mid D) = \int d\nu\, P(\theta, \nu \mid D). \tag{3.4}$$

The integrations are performed with the Monte Carlo algorithm described next.
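
Before turning to the Monte Carlo treatment, the logic of posterior, evidence and marginalization can be illustrated on a deliberately small toy problem that is solvable on a grid. Everything in the sketch below is an assumption made for illustration: a single parameter of interest `theta` with a flat prior, a single nuisance parameter `nu` with a Gaussian prior (mean 1, width 0.15, cut at three standard deviations), and one "measurement" whose prediction is `nu * theta**2`. Because the prediction depends on `theta` only through its square, the marginal posterior shows two mirror solutions of equal weight.

```python
import math


def gauss(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def midpoints(lo, hi, n):
    h = (hi - lo) / n
    return [lo + (k + 0.5) * h for k in range(n)], h


def marginal_posterior(d=1.0, sigma_d=0.1, n_theta=200, n_nu=60):
    """Grid version of posterior -> evidence -> marginalization for the toy model."""
    thetas, h_t = midpoints(-2.0, 2.0, n_theta)   # flat prior on [-2, 2]
    nus, h_n = midpoints(0.55, 1.45, n_nu)        # nuisance range: 1 +- 3 * 0.15
    prior_theta = 1.0 / 4.0
    unnormalized = []
    for t in thetas:
        # integrate likelihood * nuisance prior over nu (prior not renormalized
        # after the cut, which only rescales the evidence by ~0.3%)
        s = sum(gauss(d, nu * t * t, sigma_d) * gauss(nu, 1.0, 0.15) for nu in nus)
        unnormalized.append(s * h_n * prior_theta)
    evidence = sum(unnormalized) * h_t
    marginal = [u / evidence for u in unnormalized]
    return thetas, marginal, evidence, h_t


thetas, marginal, evidence, h_t = marginal_posterior()
prob_positive = sum(m for t, m in zip(thetas, marginal) if t > 0.0) * h_t
best_theta = max(zip(marginal, thetas))[1]
print("evidence:", evidence, " P(theta > 0):", prob_positive)
```

A grid of this kind scales exponentially with the number of parameters, which is why it is not an option for a fit with several couplings and a few dozen nuisance parameters.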
3.1 Monte Carlo Algorithm 

The presence of multiple, well separated modes, the large dimensionality of the parameter 
space, and the costly evaluation of the likelihood require a sophisticated algorithm [86]. 
We sketch the main steps of this new algorithm: 

1. A sufficiently large number of Markov chains are run in parallel for O(50000) iterations
to explore the parameter space with an adaptive local random walk. The chains
need a burn-in phase, thus we discard the first 15% of the iterations.

2. Chains whose common $R$-value [87] is reasonably small, say $R < 2$, are combined into
groups.

3. We create patches with a length of O(1000) points from the individual chains and
define a multivariate density from the mean and covariance of each patch.

4. Using hierarchical clustering [88], we combine the patches into a smaller number of
clusters. As the initial guess for the clustering, we construct a fixed number of about
30 patches of length O(5000) from each group of chains.

5. We define a multivariate mixture density from the output of the clustering by assigning
equal weights to each cluster. This mixture density serves as the initial proposal
density for the Population Monte Carlo (PMC) algorithm [89, 90].

6. Using a computing cluster with a few hundred cores, we draw importance samples
and adapt the proposal density to the posterior until convergence is achieved; i.e.,
until the difference in perplexity between two consecutive steps is less than 2%.

7. Given the resulting proposal density, we collect $2 \cdot 10^{…}$ importance samples to compute
marginal distributions and the evidence.

The biggest advantages of this approach are the automatic adaptation to the complicated
posterior shape, and the ability to massively parallelize the costly evaluation of the
likelihood.
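
Steps 5 to 7 can be illustrated with a one-dimensional toy version of Population Monte Carlo. This is a minimal sketch and not the implementation used for the fit: the target is an invented two-mode density with known normalization `Z_TRUE`, the proposal is a two-component Gaussian mixture whose starting values stand in for the output of the clustering step, and the update of weights, means and widths is a standard importance-weighted (Rao-Blackwellized) mixture update that we assume here for concreteness. Perplexity is computed as the exponential of the Shannon entropy of the normalized importance weights, divided by the number of samples.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


Z_TRUE = 3.0  # known normalization of the toy target, used only for checking


def target(x):
    """Unnormalized toy 'posterior' with two well-separated modes."""
    return Z_TRUE * 0.5 * (normal_pdf(x, -2.0, 0.5) + normal_pdf(x, 2.0, 0.5))


def pmc_step(components, n_samples, rng):
    """One PMC iteration; components is a list of (weight, mean, sigma)."""
    alphas = [a for a, _, _ in components]
    picks = rng.choices(components, weights=alphas, k=n_samples)
    xs = [rng.gauss(mu, sigma) for _, mu, sigma in picks]
    dens = [[a * normal_pdf(x, mu, s) for a, mu, s in components] for x in xs]
    q = [sum(row) for row in dens]                      # mixture proposal density
    w = [target(x) / qx for x, qx in zip(xs, q)]        # importance weights
    w_sum = sum(w)
    evidence = w_sum / n_samples
    w_norm = [wi / w_sum for wi in w]
    entropy = -sum(wi * math.log(wi) for wi in w_norm if wi > 0.0)
    perplexity = math.exp(entropy) / n_samples
    adapted = []
    for k, old in enumerate(components):
        rho = [w_norm[i] * dens[i][k] / q[i] for i in range(n_samples)]
        a = sum(rho)
        if a < 1e-12:                                   # component died out
            adapted.append((0.0, old[1], old[2]))
            continue
        mu = sum(r * x for r, x in zip(rho, xs)) / a
        var = sum(r * (x - mu) ** 2 for r, x in zip(rho, xs)) / a
        adapted.append((a, mu, math.sqrt(max(var, 1e-12))))
    return adapted, evidence, perplexity


def run_pmc(components, n_samples=4000, max_steps=20, tol=0.02, seed=0):
    """Adapt until the relative change in perplexity drops below tol."""
    rng = random.Random(seed)
    previous = None
    for _ in range(max_steps):
        components, evidence, perplexity = pmc_step(components, n_samples, rng)
        if previous is not None and abs(perplexity - previous) < tol * previous:
            break
        previous = perplexity
    return components, evidence, perplexity


initial = [(0.5, -1.5, 1.0), (0.5, 1.5, 1.0)]  # stand-in for the clustering output
final_components, evidence, perplexity = run_pmc(initial)
print(final_components, evidence, perplexity)
```

The mean importance weight estimates the evidence, and a perplexity close to one signals that the proposal matches the target well. Since each sample is drawn and weighted independently, the expensive target evaluations can be distributed over many cores.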




3.2 Priors

We use flat priors for the Wilson coefficients. This is not done because we want to imply
complete prior ignorance, but, instead, we want a convenient, sufficiently diffuse density,
with the expectation that the posterior is dominated by the likelihood.

For the nuisance parameters, the choice of prior depends on the parameter's nature.
There are the four quark-mixing matrix (CKM) parameters, the $b$ and $c$ quark masses, the
decay constant $f_{B_s}$ entering $B_s \to \bar\mu\mu$, and most dominantly the $B \to K^{(*)}$ form factors. In
addition, unknown subleading contributions in the two distinct $\Lambda/m_b$ expansions at large
and low recoil are parametrized as nuisance parameters. The complete list of almost 30
nuisance parameters along with the choice of the prior densities is presented in App. B.
Note that most nuisance parameters only affect a subset of the observables.

Where possible, the posterior distributions of the nuisance parameters from previous 
analyses fitting different data are used as the prior distributions in our fit. As an example, 
we use the output for the quark masses [91] in the form of LogGamma distributions; see 
App. D.2. 

For the CKM parameters $A$, $\lambda$, $\bar\rho$, $\bar\eta$, we choose the results of the UTfit Collaboration
[92]. When allowing for BSM contributions in our fit, we use the results of the so-called
CKM tree-level fit for $A$, $\lambda$, $\bar\rho$, $\bar\eta$. The tree-level fit represents only the basic constraints
from SM tree-level processes, which every extension of the SM must include and in which we assume
that BSM contributions are negligible. Thus no information from rare $B$ decays and $B$-$\bar B$
mixing is used indirectly through the priors. For the SM predictions, we use the results
of the SM-CKM fit instead. The posterior distributions from either fit are assumed to be
symmetric Gaussian distributions, which was found to be a good approximation.

We model theory uncertainties with Gaussian distributions in cases where authors only
report an estimate of the magnitude. This is justified by the principle of maximum entropy
[93]. As an example, suppose the quoted uncertainty of a QCD form factor $f$ is 15%. We
introduce a nuisance parameter, $\zeta_f$, as a scaling factor, such that $f \to \zeta_f \cdot f$, and we vary
$\zeta_f \sim \mathcal{N}(\mu = 1, \sigma = 0.15)$, with an allowed range $\zeta_f \in [1 - 3\sigma, 1 + 3\sigma]$; i.e., neglecting the
tails of the Gaussian beyond $3\sigma$. If necessary, we modify the range to avoid unphysical
values of $f$. Subleading phases are incorporated with flat priors covering the full range.
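
The scaling-factor prior from this example can be written down in a few lines. The sketch below is one possible reading, labeled as an assumption: a Gaussian with mean 1 and width 0.15 that is set to zero outside three standard deviations and renormalized to unit integral on the allowed range. Whether the truncated density is renormalized is not stated in the text; the function name is hypothetical.

```python
import math


def truncated_gauss_logpdf(x, mu=1.0, sigma=0.15, n_sigma=3.0):
    """Log-density of a Gaussian cut at mu +- n_sigma * sigma (renormalized; an assumption)."""
    lo, hi = mu - n_sigma * sigma, mu + n_sigma * sigma
    if not lo <= x <= hi:
        return -math.inf
    # probability mass of the untruncated Gaussian inside [lo, hi]
    mass = math.erf(n_sigma / math.sqrt(2.0))
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma * math.sqrt(2.0 * math.pi))
            - math.log(mass))


def integrate_prior(n=2000, mu=1.0, sigma=0.15, n_sigma=3.0):
    """Midpoint-rule check that the truncated prior integrates to one."""
    lo, hi = mu - n_sigma * sigma, mu + n_sigma * sigma
    h = (hi - lo) / n
    return h * sum(math.exp(truncated_gauss_logpdf(lo + (k + 0.5) * h, mu, sigma, n_sigma))
                   for k in range(n))


prior_norm = integrate_prior()
print(prior_norm, truncated_gauss_logpdf(1.0), truncated_gauss_logpdf(2.0))
```

Returning negative infinity outside the range makes the log-posterior reject such points automatically, so a sampler never evaluates the costly likelihood there.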

3.3 Experimental Results 

We form the total log likelihood, $\log P(D \mid \theta, \nu)$, by summing over the individual
contributions. The complete list of experimental results used is given in tables 1 to 3. The
majority of results are incorporated as 1-dimensional Gaussian distributions, whose variances are
obtained by adding statistical and systematic uncertainties in quadrature,
$\sigma^2 = \sigma_{\rm stat}^2 + \sigma_{\rm syst}^2$. In the case of asymmetric uncertainties, we use
a piecewise function constructed from two Gaussian distributions around the central value with
different variances. With the exception of the upper bound on $\mathcal{B}(B_s \to \mu^+\mu^-)$, we
limit ourselves to the Gaussian distribution in the likelihood, despite the discontinuities arising
from asymmetric uncertainties, to obtain results that are comparable with the existing literature.
Known correlations between observations, e.g., between the time-dependent CP asymmetries $S$ and
$C$ in $B \to K^*\gamma$, are represented by multivariate Gaussian distributions. Note that in
$B \to K^*\ell^+\ell^-$ decays the observables $A_{\rm FB}$ and $F_L$ are extracted from a
simultaneous fit to the double differential decay rate. Requiring physical values, i.e.,

$$\frac{d^2\Gamma}{dq^2\, d\cos\theta_{\ell,K}} \geq 0,$$

cuts out an allowed region in the $(A_{\rm FB}, F_L)$-plane. Since the fit for $(A_{\rm FB}, F_L)$
typically converges near the unphysical region, the resulting contribution to the likelihood would
be distinctly non-Gaussian. Unfortunately, the resulting 2D likelihood is not publicly
available; thus we assume $A_{\rm FB}$ and $F_L$ to be independent and Gaussian distributed.
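
As an illustration of the Gaussian likelihood terms described in this section, the following sketch combines statistical and systematic uncertainties in quadrature and evaluates a piecewise Gaussian for asymmetric uncertainties. The function names are hypothetical, and the convention that the upper uncertainty applies when the prediction lies above the central value is an assumption of this sketch.

```python
import math

def log_gauss(x, mu, sigma):
    """Log of a normalized 1D Gaussian density."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def total_sigma(stat, syst):
    """Statistical and systematic uncertainties added in quadrature."""
    return math.hypot(stat, syst)

def log_like_asymmetric(x_pred, central, sigma_up, sigma_down):
    """Piecewise Gaussian around `central` with different widths on each side.

    Each side is a normalized Gaussian with its own width, so the density
    jumps at the central value whenever sigma_up != sigma_down.
    """
    sigma = sigma_up if x_pred >= central else sigma_down
    return log_gauss(x_pred, central, sigma)
```

The jump at the central value equals log(sigma_up / sigma_down) in the log likelihood, which is the discontinuity referred to in the text.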

We include the results of direct searches for the decay $B_s \to \mu^+\mu^-$ in the likelihood.
Often, only the 90% and 95% limits on the branching ratio $\mathcal{B}$, obtained with the CLs
method [94], are published. However, there is no single best way to translate these limits
into a useful contribution to the likelihood, and several schemes of varying sophistication
exist in the literature [95-97]. It is preferable to directly use the Bayesian posterior on
the branching ratio, $P(\mathcal{B} \mid D)$, computed by a general algorithm for multichannel
search experiments [98]. This posterior is often produced to compute Bayesian limits for cross
checks with CLs results. The input numbers (expected signal yields and background yields)
that are needed to compute $P(\mathcal{B} \mid D)$ are publicly available from LHCb [23]; only the
correlations of the yields are not published. By reinterpreting the function
$P(\mathcal{B} \mid D)$ as $P(D \mid \theta, \nu)$, the desired contribution to the likelihood is
found. For a convenient approximation to $P(\mathcal{B} \mid D)$, we use the four-parameter
Amoroso distribution [99], with relevant details given in App. D.1. For the data supplied by
LHCb [23], the relative error of this interpolation is at most 2%.

3.4 Uncertainties of Theory Predictions 

Within the Bayesian framework, the procedure to calculate the uncertainty of an observable's
prediction within a given theory, say the SM, is essentially uncertainty propagation.
In this case, an observable $A$ depends on Wilson coefficients and on additional nuisance
parameters. We fix the values of the Wilson coefficients, $\theta = \theta_{\rm SM}$, so the value
of $A$ is uniquely determined by $\nu$; i.e., $A = f(\nu)$. We vary the nuisance parameters
according to their prior, $P(\nu)$. The distribution of the random variable $A$, $P(A)$, is given by

$$P(A) = \int d\nu\, P(A, \nu) = \int d\nu\, P(A \mid \nu)\, P(\nu) = \int d\nu\, \delta(A - f(\nu))\, P(\nu), \qquad (3.6)$$

where we used the Dirac $\delta$-distribution. Numerically, one only needs to draw parameter
samples $\nu_i \sim P(\nu)$ and calculate $A_i$ for each sample $\nu_i$. We collect the resulting
samples $A_i$ in one-dimensional histograms to extract 68% intervals. As before, we assume
$P(\nu) = \prod_j P(\nu_j)$ and use the priors listed in App. B. As a welcome side effect, this form
of $P(\nu)$ allows us to efficiently sample $\nu$ from the joint prior by sampling from simple, 1D
priors directly without the need to resort to MCMC or PMC.
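
The sampling procedure behind (3.6) can be sketched in a few lines. Everything specific here is an assumption made for illustration: the toy observable, its two nuisance parameters and their prior widths are hypothetical, and the smallest interval is taken from the sorted samples rather than from a histogram.

```python
import math
import random

def smallest_interval(samples, level=0.68):
    """Shortest interval containing at least `level` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(level * n)          # number of samples the interval must hold
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

def observable(zeta, m):
    """Toy prediction A = f(nu): quadratic in a scaled form factor (hypothetical)."""
    f = zeta * 0.30
    return m * f * f

rng = random.Random(42)
samples = []
for _ in range(20_000):
    zeta = rng.gauss(1.0, 0.15)       # independent 1D prior of the first nuisance parameter
    m = rng.gauss(4.2, 0.1)           # independent 1D prior of the second nuisance parameter
    samples.append(observable(zeta, m))

lo, hi = smallest_interval(samples)
central = observable(1.0, 4.2)        # prediction at the central parameter values
```

Because the joint prior factorizes, each parameter is drawn from its own 1D distribution and no Markov chain is needed.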

If we take this approach one step further, we can ask: what are the likely values, or
informally speaking "the allowed ranges," of $A$, given the set of measurements $D$ listed in
Section 2? Using the full posterior on both Wilson coefficients and nuisance parameters,
$P(\theta, \nu \mid D)$, we obtain

$$P(A \mid D) = \int d\theta\, d\nu\, \delta(A - f(\theta, \nu))\, P(\theta, \nu \mid D). \qquad (3.7)$$

We simply take the posterior samples produced by the PMC algorithm, compute $A_i$ for each
sample $(\theta_i, \nu_i)$, and finally fill the sample $A_i$ with its importance weight into a
histogram.
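
Filling importance-weighted samples into a histogram, as described for (3.7), might look as follows. This is a minimal sketch: the function name and the equal-width binning are assumptions, and the weights are taken to be the importance weights delivered with the samples.

```python
def weighted_histogram(values, weights, lo, hi, n_bins):
    """Normalized histogram of `values`, each entry counted with its weight."""
    counts = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for v, w in zip(values, weights):
        if lo <= v < hi:
            # min() guards against rounding just below the upper edge
            counts[min(int((v - lo) / width), n_bins - 1)] += w
    total = sum(counts)
    return [c / total for c in counts]

# three samples of an observable with unequal importance weights
hist = weighted_histogram([0.1, 0.1, 0.9], [1.0, 1.0, 2.0], lo=0.0, hi=1.0, n_bins=2)
```

The sketch assumes that at least one sample with nonzero weight falls inside the histogram range; otherwise the normalization would divide by zero.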

3.5 Goodness of Fit and Model Comparison 

To check that the assumed model with three real Wilson coefficients provides a good
description of the experimental observations, we determine the goodness of fit. We follow
the standard procedure: first we choose a test statistic $T(D \mid \theta, \nu)$ with the parameter
values chosen at a local mode of the posterior, then calculate its distribution, and finally
determine the value of the test statistic for the actual data set. For more details on p-values
and how we interpret them in this work, we refer to [100]. We make two closely related
choices for $T$, defined as follows.

For each observable $x$, we compare its theory prediction $x_{\rm pred}(\theta, \nu)$ with the mode
of the experimental distribution (central value) of $x$, denoted by $x^*$. We compute the
frequency $f$ that a value of $x$ less extreme than $x_{\rm pred}$ would be observed. Using the
inverse of the Gaussian cumulative distribution function, $\Phi^{-1}(\cdot)$, we define the pull:

$$\delta = \Phi^{-1}\left[\frac{f + 1}{2}\right]. \qquad (3.8)$$

Note that for a 1-dimensional Gaussian, this reduces to the usual
$\delta = (x^* - x_{\rm pred})/\sigma$. In the 1-dimensional case, the (Gaussian, Amoroso)
distributions yield a signed $\delta$ (positive if $x^* > x_{\rm pred}$, else negative), while for the
multivariate Gaussian, $\delta$ is positive semidefinite. We define the test statistic
$T_{\rm pull}$ as

$$T_{\rm pull} = \sum_i \delta_i^2, \qquad (3.9)$$

where $i$ extends over all experimental data. As a cross check, we also consider $T_{\rm like}$,
defined as the value of the log likelihood, $T_{\rm like} = \log P(D \mid \theta, \nu)$. Its frequency
distribution is approximated by generating 10^ pseudo experiments
$D \sim P(D \mid \theta^*, \nu^*)$, where $\theta^*, \nu^*$ are fixed values at a local maximum of the
posterior. Since we do not have the raw data (events, detector simulations etc.) available, we
generate pseudo experiments as follows. Consider the case of a single measurement with Gaussian
uncertainties, $\mathcal{N}(\mu = x^*, \sigma)$: we fix the theory prediction, shifting the maximum
of the Gaussian to $x_{\rm pred}(\theta^*, \nu^*)$, but keep the uncertainties reported by the
experiment. Then we generate $x \sim \mathcal{N}(\mu = x_{\rm pred}(\theta^*, \nu^*), \sigma)$,
and proceed analogously for all observables to sample $D$. The p-value is computed by
counting the fraction of experiments with a likelihood value smaller than that for the
observed data set and corrected for the number of degrees of freedom; see Section III.D.5
in [100]. Although the generation of pseudo data is far from perfect, we emphasize that,
on the one hand, it is fast and, on the other hand, we will not consider the actual value of
$p$ too rigorously. Two models with p-values of 40% and 60% both describe the data well,
and that is all the information we need.
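
The pseudo-experiment procedure for the likelihood-based statistic can be sketched for the simplest case of independent Gaussian measurements. This illustration makes several assumptions: the function names are hypothetical, the number of pseudo experiments is an arbitrary choice, and the correction for the number of degrees of freedom described in [100] is not applied.

```python
import math
import random

def log_like(data, preds, sigmas):
    """Sum of independent Gaussian log-likelihood terms."""
    return sum(
        -0.5 * ((x - m) / s) ** 2 - math.log(s * math.sqrt(2.0 * math.pi))
        for x, m, s in zip(data, preds, sigmas)
    )

def p_value(observed, preds, sigmas, n_pseudo=10_000, seed=0):
    """Fraction of pseudo experiments with a smaller likelihood than the data.

    Pseudo data are drawn around the fixed theory predictions, keeping the
    experimental uncertainties unchanged.
    """
    rng = random.Random(seed)
    t_obs = log_like(observed, preds, sigmas)
    n_smaller = 0
    for _ in range(n_pseudo):
        pseudo = [rng.gauss(m, s) for m, s in zip(preds, sigmas)]
        if log_like(pseudo, preds, sigmas) < t_obs:
            n_smaller += 1
    return n_smaller / n_pseudo
```

Data that agree perfectly with the predictions give a p-value of one, while data several standard deviations away from every prediction give a p-value close to zero.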

If we used the maximum likelihood parameters and ignored the $B_s \to \mu^+\mu^-$ contribution,
both statistics would be equivalent to $\chi^2$ and thus yield the same p-value. The parameter
values at the global mode of the posterior differ only little from the maximum likelihood
values, and $B_s \to \mu^+\mu^-$ represents only one of 59 inputs. We therefore consider it
reasonable to approximate the distribution of $T_{\rm pull}$ by the $\chi^2$-distribution in order to
compute a p-value.
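
For a single Gaussian measurement, the pull and the statistic built from it can be sketched as below. The sketch assumes that the pull follows from the frequency $f$ as $\Phi^{-1}[(f+1)/2]$, which is the choice that reproduces $|x^* - x_{\rm pred}|/\sigma$ in the Gaussian case; the function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()            # zero mean, unit width

def pull_gaussian(x_star, x_pred, sigma):
    """Signed pull for a 1D Gaussian measurement with central value x_star."""
    z = abs(x_star - x_pred) / sigma
    f = 2.0 * _STD_NORMAL.cdf(z) - 1.0             # frequency of less extreme outcomes
    delta = _STD_NORMAL.inv_cdf(0.5 * (f + 1.0))   # inverse Gaussian CDF
    return math.copysign(delta, x_star - x_pred)   # positive if x_star > x_pred

def t_pull(pulls):
    """Test statistic: sum of squared pulls over all measurements."""
    return sum(d * d for d in pulls)
```

For pulls beyond roughly eight standard deviations the cumulative distribution rounds to one in double precision and `inv_cdf` raises an error, so a production version would need to treat that regime separately.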

If there are several local modes with reasonably high p-values, it is necessary to assess
which of them is favored by the data; i.e., to perform a model comparison. Suppose the full
parameter space is decomposed into disjoint subsets $V_i$, $i = 1 \dots n$, where $V_i$ contains
only a single mode of the posterior. Then we compute the local evidence $Z_i$ by integrating over
$V_i$ in (3.3). In fact, the integral is available as the average weight of all importance samples
in $V_i$, with an accuracy of roughly 5%. The Bayes factor $B_{ij} = Z_i/Z_j$ is the
data-dependent part in the posterior odds of two statistical models $M_i$, $M_j$:

$$\frac{P(M_i \mid D)}{P(M_j \mid D)} = B_{ij}\, \frac{P(M_i)}{P(M_j)}. \qquad (3.10)$$

In addition to the local evidence, we compute the evidence $Z_{\rm SM}$ for the SM case with fixed
values for the Wilson coefficients, but with $\nu$ allowed to vary. Computing the Bayes factor
with $Z_{\rm SM}$ and the local evidence for the region with SM-like signature allows us to assess
if the data are in favor of adding three degrees of freedom for the Wilson coefficients to
achieve a better agreement with the theory predictions, or if the SM is preferred due to its
simplicity.



4 Results 

In the following, we discuss the results of our analysis of the experimental data described
in Sec. 2.2 using the statistical tools explained in Sec. 3. First, the results of the global fit
of the three Wilson coefficients and 28 nuisance parameters to 59 experimental inputs are
presented, including marginal distributions and best fit points. As for goodness of fit, we
list p-values and evidence for each of the arising solutions. Furthermore, we show pull values
for the included measurements, and we discuss the fit results of nuisance parameters if
the posterior differs significantly from the prior. Second, we provide predictions of
yet-unmeasured, optimized observables in $B \to K^*(\to K\pi)\ell^+\ell^-$ at low and high $q^2$
within our scenario, taking into account the experimental data. Finally, we give SM predictions for
measured and yet-unmeasured observables including theory uncertainties determined using
the Monte Carlo method as explained in Sec. 3.4.

4.1 Fit Results 

Here, we summarize the main part of our work: the results of the fits of the parameters of
interest, the Wilson coefficients $C_{7,9,10}$, to the data listed in Sec. 2.2. Details of the Monte
Carlo algorithm are given in Sec. 3.1. The treatment of prior distributions is explained in


Figure 1. The marginalized 2-dimensional 95% credibility regions of the Wilson coefficients
$C_{7,9,10}$ for $\mu = 4.2$ GeV are shown when applying the $B \to K^*\gamma$ constraints in
combination with i) only low- and high-$q^2$ data from $B \to K\ell^+\ell^-$ [brown]; ii) only
low-$q^2$ data from $B \to K^*\ell^+\ell^-$ [blue]; iii) only high-$q^2$ data from
$B \to K^*\ell^+\ell^-$ [green]; and iv) all the data, including also $B_s \to \mu^+\mu^-$
[light red], showing as well the 68% credibility interval [red]. The SM values
$C_{7,9,10}^{\rm SM}$ are indicated by ♦.
Sec. 3.2, the priors of the nuisance parameters are specified in App. B. For $C_{7,9,10}$, we use
flat priors with

$$C_7 \in [-1, 1], \qquad C_{9,10} \in [-10, 10]. \qquad (4.1)$$

The fit not only constrains the Wilson coefficients $C_{7,9,10}$, but updates our knowledge of
the nuisance parameters. We discuss the most significant changes.

The marginalized two-dimensional 95% credibility regions are shown in Fig. 1 when
applying the $B \to K^*\gamma$ constraints (Sec. 2.2.1) in combination with i) only low- and
high-$q^2$ data from $B \to K\ell^+\ell^-$ (Sec. 2.2.2); ii) only low-$q^2$ data from
$B \to K^*\ell^+\ell^-$ ^; iii) only high-$q^2$ data from $B \to K^*\ell^+\ell^-$ (Sec. 2.2.3); and
finally iv) all the data, including also $B_s \to \mu^+\mu^-$ (Sec. 2.2.4).

^Here we enlarged the prior ranges of $C_{7,9,10}$ by a factor of 2, which is irrelevant for the
remainder of our work.
The most stringent constraints on $C_{9,10}$ come from the high-$q^2$ data of
$B \to K^*\ell^+\ell^-$, which should be taken with some caution since the form factors are only
available as extrapolations of LCSR results from low $q^2$. In the near future, we expect more
accurate lattice calculations of form factors to close this weak point. Also shown are the SM
predictions of $C_{7,9,10}(\mu = 4.2\ {\rm GeV})$ using NNLO evolution [31].

We confirm the findings of previous analyses [64, 66] that only two solutions make up
95% of the probability: the first exhibits the same signs of $C_{7,9,10}$ as the SM. The second
solution corresponds to a first-order degeneracy of all observables under a simultaneous
sign flip $C_{7,9,10} \to -C_{7,9,10}$. There are two additional local maxima that correspond to a
sign flip $C_7 \to -C_7$ of the former solutions. In Tab. 4, we list the properties of these four
modes, categorized by the signs of $C_{7,9,10}$. As witnessed by the evidence, $Z$, the SM-like and
sign-flipped solutions essentially make up the whole posterior mass, with ratios of 52% and
48%, respectively. The other two solutions are suppressed by many orders of magnitude,
and thus do not appear at the 95% level. For the two dominant solutions, the
goodness-of-fit results are nearly identical: both p-values based on the statistics $T_{\rm like}$
and $T_{\rm pull}$ (see Sec. 3.5) are large, indicating a good fit. In contrast, the suppressed
solutions do not seem to explain the data well. We note that the MCMC revealed a handful of
additional modes with $6 < |C_{9,10}| < 9$. We do not consider these further because they are
suppressed by a factor of roughly exp(40) compared to the global maximum.

To highlight the fit results, we present the pull (3.8), the normalized deviation between
theory prediction and measured value in units of Gaussian $\sigma$, for all 59 constraints.
Pulls for $B \to K^*\gamma$ [left] and $B \to K\ell^+\ell^-$ [right] constraints are shown in Fig. 2
and those for $B \to K^*\ell^+\ell^-$ in Fig. 3. The pull for LHCb's result of
$B_s \to \mu^+\mu^-$ is $-1.1$; i.e., its most likely value from the measurement is about
$1\sigma$ (in terms of the experimental uncertainty) lower than the theory prediction. Here, the
theory parameters are chosen at the global maximum of the posterior. With the best-fit parameters
with $C_{7,9,10}$ fixed at SM values (see below), we obtain nearly identical plots, and we
therefore omit them. We observe the largest pull at $+2.5$ for the Belle measurement of
$\langle\mathcal{B}\rangle[16, 19.21]$ for $B \to K^*\ell^+\ell^-$. It is the only pull surpassing
2.0. Fig. 3 shows, for example, how the debate about the existence of a zero crossing of
$A_{\rm FB}$ at large recoil was settled: the first published measurements by Belle and CDF
deviated from the SM prediction, but when taken together with LHCb's recent result that pulls the
best fit point towards the SM, there is good agreement between the SM and the experiments.

We also perform the global fit with $C_{7,9,10}$ fixed to the SM values, varying only the
nuisance parameters; see the bottom row in Tab. 4. The prior normalization then changes
by log(800) = 6.68 due to omitting $C_{7,9,10}$ with ranges given in (4.1). The values of
$T_{\rm like}$ and $T_{\rm pull}$ are just as good as for the two dominant solutions, but the
p-values are even larger, as the number of degrees of freedom used in the $\chi^2$-distribution to
calculate $p$ differs by three. We compute the Bayes factor of the SM vs the SM-like solution by
dividing their respective evidences:

$$B = \frac{Z_{\rm SM}}{Z_{\text{SM-like}}} = \exp(392.6 - 385.3) \approx 1500. \qquad (4.2)$$

Assuming prior odds of one, the posterior odds are given by $B$, and thus are clearly
sgn(C_7, C_9, C_10)   best-fit point             log(MAP)   T_like   p_like   T_pull   p_pull   log(Z)

(-,+,-)               (-0.293, 3.69, -4.19)      425.22     402.59   60%      48.4     75%      385.3
(+,-,+)               (0.416, -4.59, 4.05)       425.08     402.49   60%      48.5     75%      385.2
(-,-,+)               (-0.393, -3.12, 3.20)      404.67     387.88   0.9%     76.5     4%       363.9
(+,+,-)               (0.558, 2.25, -3.24)       400.91     384.52   0.2%     83.1     1%       358.9

SM: (-,+,-)           (-0.327, 4.28, -4.15)      431.46†    402.53   70%      48.5     83%      392.6

Table 4. Best-fit point, log maximum a-posteriori (MAP) value, goodness of fit summary and
log evidence for the four local modes (denoted by the signs of $(C_7, C_9, C_{10})$) of the posterior
including all experimental constraints. The renormalization scale is fixed to $\mu = 4.2$ GeV. For
comparison, we include the case with $(C_7, C_9, C_{10})$ fixed at the SM values for which only
nuisance parameters are varied (denoted by SM). The nuisance parameters are discarded when
counting the degrees of freedom to compute the shown p-values [%] based on the statistics
$T_{\rm like}$ and $T_{\rm pull}$. † When comparing the posterior of the SM with the other modes, it
has to be noted that the prior volume of $(C_7, C_9, C_{10})$ is 6.68 in log units.
Figure 2. Pull values for observables in $B \to K^*\gamma$ [left] and $B \to K\ell^+\ell^-$ [right]
calculated at the best fit point. The pull definition for the correlated observables $S$ and $C$
permits only $\delta > 0$; for details see Sec. 3.5.

in favor of the simpler model. The effect persists if we cut the allowed range of each
$C_i$ in half to exclude all but the SM-like solution. In conclusion, both the SM (with
nuisance parameters allowed to vary) and our extension with real floating $C_{7,9,10}$ fit the 59
experimental observations of rare $B$ decays well. Since the extension does not provide any
significant improvement, the simpler model should be preferred.
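
The arithmetic behind the model comparison can be checked directly from the log evidences quoted in Tab. 4 and the prior ranges in (4.1). The short script below is only a cross check of those published numbers; the variable names are illustrative.

```python
import math

# log evidences of the four local modes, labelled by sgn(C7, C9, C10) (Tab. 4)
log_z = {"(-,+,-)": 385.3, "(+,-,+)": 385.2, "(-,-,+)": 363.9, "(+,+,-)": 358.9}
log_z_sm = 392.6                       # Wilson coefficients fixed to SM values

# prior volume of (C7, C9, C10): widths 2, 20 and 20 from the flat priors (4.1)
log_prior_volume = math.log(2 * 20 * 20)

# Bayes factor of the SM versus the SM-like solution
bayes_factor = math.exp(log_z_sm - log_z["(-,+,-)"])

# share of the posterior mass in each mode; subtract the maximum for stability
z_max = max(log_z.values())
norm = sum(math.exp(v - z_max) for v in log_z.values())
mass = {mode: math.exp(v - z_max) / norm for mode, v in log_z.items()}
```

The script gives a log prior volume of about 6.68, a Bayes factor of about 1480 (quoted as roughly 1500), and posterior mass fractions of about 52% and 48% for the SM-like and sign-flipped modes.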

To study the dependence of our fit results on the priors, we use a second set of priors
(wide priors). We scale the uncertainties of those parameters associated with form factors
and unknown subleading contributions in $\Lambda/m_b$ (Tab. 12) by a factor of three and adjust
the parameter ranges accordingly. All other priors are kept the same. This choice includes
the major sources of theory uncertainty and represents a pessimist's view of a) the validity
of form factor results based on LCSR at low $q^2$, b) their extrapolation to high $q^2$ values,
Figure 3. Pull values for observables in $B \to K^*\ell^+\ell^-$ calculated at the best fit point.





        C_7                               C_9                           C_10

nominal priors
68%     [-0.34, -0.23] ∪ [0.35, 0.45]     [-5.2, -4.0] ∪ [3.1, 4.4]     [-4.4, -3.4] ∪ [3.3, 4.3]
95%     [-0.41, -0.19] ∪ [0.31, 0.52]     [-5.9, -3.5] ∪ [2.6, 5.2]     [-4.8, -2.8] ∪ [2.7, 4.7]
max     -0.28 ∪ 0.40                      -4.56 ∪ 3.64                  -3.92 ∪ 3.86

wide priors
68%     [-0.39, -0.19] ∪ [0.30, 0.48]     [-5.6, -3.8] ∪ [2.9, 5.1]     [-4.0, -2.5] ∪ [2.6, 3.9]
95%     [-0.53, -0.13] ∪ [0.24, 0.61]     [-6.7, -3.1] ∪ [2.2, 6.2]     [-4.7, -1.9] ∪ [2.0, 4.6]
max     -0.30 ∪ 0.38                      -4.64 ∪ 3.84                  -3.24 ∪ 3.30

Table 5. The 68% and 95% credibility intervals and the two local modes of the marginalized
1-dimensional posterior distributions of the Wilson coefficients $C_{7,9,10}$ at $\mu = 4.2$ GeV for
nominal [upper] and wide [lower] ranges of nuisance parameters (see App. B).



and c) subleading corrections exceeding expectations from power counting. The results
of the fit at the low scale $\mu = 4.2$ GeV to all data with these new priors are shown in
Fig. 4 alongside the corresponding 68%- and 95%-credibility regions of Fig. 1 for the two
solutions in each of the three planes $C_7$-$C_9$, $C_7$-$C_{10}$ and $C_9$-$C_{10}$. Most
importantly, the fit
Figure 4. The marginalized 2-dimensional 68% (and 95%) credibility regions of the Wilson
coefficients $C_{7,9,10}$ at $\mu = 4.2$ GeV for the SM-like [top row] and sign-flipped solution
[bottom row], arising from nominal ranges of Fig. 1 [red and light red, respectively] and wide
ranges [solid and dashed contours, respectively] of the nuisance parameters. We indicate the
values of $C_{7,9,10}$ in the SM [♦] and at the local maximum of the posterior [X] resulting from
nominal prior ranges in the respective region.

is stable and gives comparable results with both sets of priors thanks to the large number
of experimental constraints. In all six planes, the area covered by the 68% region with
wide priors is similar to that of the 95% region with nominal priors. While the two sets of
regions are concentric in the $C_7$-$C_9$ plane, there appears a rather hard cut-off at
$|C_{10}| \approx 5$ in the $C_{7,9}$-$C_{10}$ planes. For completeness, we list the set of smallest
intervals and local maxima derived from the one-dimensional marginalized distributions for
$C_{7,9,10}$ for both sets of priors in Tab. 5. Our results for the 95% credibility intervals are
compatible with those of Ref. [66]. More specifically, we find a larger interval for $C_7$,
covering smaller values of $|C_7|$. This is due to the $B \to X_s\gamma$ constraints that are used
in Ref. [66], but not included in our work. However, with regard to $C_{9,10}$, we find that our
credibility intervals are 10-40% smaller. Compared to Ref. [66], we have added the 2012 results by
LHCb and BaBar. The question arises whether the inclusion of the inclusive decays
$B \to X_s\gamma$ and $B \to X_s\ell^+\ell^-$ could further shrink the $C_9$ credibility intervals.

From the allowed ranges for $C_{7,9,10}$, we can estimate limits on the scale of generic
                    Λ_7^NP [TeV]    Λ_9^NP [TeV]    Λ_10^NP [TeV]

SM-like             29, 38          28, 37          30, 44
SM-sign-flipped     12, 13          11, 13          12, 13

Table 6. Constraints on the NP scale $\Lambda_i^{\rm NP}$ ($i = 7, 9, 10$) assuming generic flavor
violation at tree level using the 95% credibility region from Tab. 5. Several possibilities arise
from destructive and constructive interference of the SM with SM-like and SM-sign-flipped
solutions.
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/+(0) bt 

Figure 5. Prior [dotted] and posterior distributions of the nuisance parameters /+(0) [left] and bl_ 
[right], governing the normalization and the shape of the B ^ K form factor f+{q^), respectively. 
We show the posterior using B KM data only [dashed] vs all data [solid] . 

flavor-changing neutral currents at tree level, described by 



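$$\mathcal{H}_{\rm eff}^{\rm NP} = \sum_{i=7,9,10} \frac{\mathcal{O}_i}{\left(\Lambda_i^{\rm NP}\right)^2}, \qquad (4.3)$$

$$\mathcal{O}_7 = m_b \left[\bar s \sigma_{\mu\nu} P_R b\right] F^{\mu\nu}, \qquad \mathcal{O}_{9,10} = \left[\bar s \gamma_\mu P_L b\right]\left[\bar\ell \gamma^\mu (1, \gamma_5) \ell\right]. \qquad (4.4)$$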

Using $C_i = C_i^{\rm SM} + C_i^{\rm NP}$ and setting $C_i$ to the boundary values of the 95%
intervals (nominal priors), we extract $C_i^{\rm NP}$. By matching (4.3) with (2.1) and (2.2), we
extract the minimum scale $\Lambda_i^{\rm NP}$ for both constructive and destructive interference
with the SM; see Tab. 6. The resulting scales above which NP "is still allowed" are similar to
those found in previous analyses [64] and [66].

So far, we discussed the fit results for the Wilson coefficients $C_{7,9,10}$ that enter most, but
not all of the observables. Exceptions are those of $B \to K^*\gamma$, which depend only on $C_7$,
and $B_s \to \mu^+\mu^-$, which depends only on $C_{10}$. The marginalized distributions in the
$C_9$-$C_{10}$ plane of Fig. 1 show that, compared to $B \to K^*$, the fit with only $B \to K$
measurements prefers a smaller value of $|C_9|^2 + |C_{10}|^2$; the marginal modes (not shown) are
near $C_9 = 0$, $C_{10} = \pm 5$. Since the $B \to K^*$ constraints dominate the combination, a
"tension" arises.

Let us now discuss the role that the nuisance parameters play in the fit. First, we note
that the posterior distributions of the common nuisance parameters (those that are not
specific to rare $b \to s$ decays, like the CKM parameters and the $c$ and $b$ quark
$\overline{\rm MS}$ masses) do not deviate from their prior distributions given in Tab. 10. This is
mainly due to the strong prior knowledge from other measurements and the comparatively low
precision of both experimental and other, mostly hadronic, theory inputs in the rare $b \to s$
decays.

Second, we consider the remaining hadronic nuisance parameters of form factors and
subleading corrections, for which the priors are based mostly on educated guesses rather
than precise knowledge. Because $B \to K$ and $B \to K^*$ form factors enter observables
at both low and high $q^2$, they are determined by all the $B \to K\ell^+\ell^-$ and
$B \to K^*(\gamma, \ell^+\ell^-)$ observables, respectively. In contrast, the parametrization of
unknown subleading $\Lambda/m_b$ corrections is different at low and high $q^2$ (and naturally in
$B \to K$ and $B \to K^*$ decays). Since subleading corrections at high $q^2$ receive further
parametric suppression by either $C_7/C_9$ or $\alpha_s$ [44, 57], the corresponding observables at
high $q^2$ are rather weakly dependent on them. In contrast, at low $q^2$ large effects are not
surprising.

Therefore, we expect a significant update to the knowledge of form factors to accommodate
the tension between $B \to K$ and $B \to K^*$ constraints. Any remaining tension should be
visible in low-$q^2$ subleading corrections.

Let us first consider the posterior distributions of the two nuisance parameters $f_+(0)$
and $b_1^+$, which enter the $q^2$ parametrization of the $B \to K$ form factor $f_+(q^2)$ (see
(B.3) and priors in Tab. 12 from LCSR results [41]). The shape of the form factor is controlled
by $b_1^+$. The low- and high-$q^2$ data of the $B \to K\ell^+\ell^-$ branching fraction (Tab. 2)
give rise to a narrower posterior compared to the prior distribution in Fig. 5, which does not
change much when using only $B \to K\ell^+\ell^-$ data or combining it with
$B \to K^*\ell^+\ell^-$. This preference also appears when choosing the wide set of ranges for the
prior distributions of the nuisance parameters, demonstrating that the data suppress the tails in
the prior of $b_1^+$. Concerning $f_+(0)$, which corresponds to the normalization of the form
factor, we observe a strong preference for low values in the posterior distribution in Fig. 5.
However, this preference almost disappears when only $B \to K\ell^+\ell^-$ data is used in the fit.
This behavior persists even when allowing for wider prior ranges, and is easily understood in terms
of the above-mentioned tension.

We also find strong modifications of the posterior with respect to prior distributions
for the three scale factors $\zeta_{A_1}, \zeta_{A_2}, \zeta_V$ entering the form factors
$A_1, A_2, V$ in $B \to K^*$. The posteriors are shown in Fig. 6 along with the common prior
distribution. Of the three, $A_1$ is known most accurately after the fit, while $A_2$ and $V$ are
simultaneously shifted and compressed. Using all constraints, $A_1, A_2, V$ are shifted towards
higher values, but without $B \to K$ constraints, the shift actually points in the opposite
direction. Again, the positive shift serves to reduce the tension and allows a good fit to all
constraints with values of $C_{9,10}$ smaller than required by the $B \to K^*$ constraints alone.

The parameters describing subleading phases are mostly unaffected by the fit. All
phases come out with a flat distribution, indicating that they could have been omitted
from the fit without any consequences.

The largest update to the knowledge of subleading parameters occurs for the scale factor
of the transversity amplitudes $A_0^L$ (B.3) describing the $B \to K^*$ decays, with a downward
shift of about 10% and a slight reduction of variance. We observe this effect only in
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g^-bin 




(4^) 


(A?) 


[1.0, 6.0] 


n AriA +0.081 +0.181 
U.^d^ -0.086 -0.158 


n +0.156 +0.355 
U.JUd _o.l21 -0.234 


n /IfiS +0.019 +0.030 
U.^DO _o.025 -0.056 


g^-bin 








[1.0, 6.0] 


n oq +0.14 +0.25 
^•"-•"J -0.10 -0.22 


n AA^ +0.055 +0.105 
U.^^i _o.058 -0.113 


n 971 +0.057 +0.117 
^■^1 -0.060 -0.117 



Table 7. Predictions of unmeasured, optimized observables based on global fit output integrated 
over the large recoil region. We list the most probable value, the smallest 68% and 95% intervals. 

the fit with all observables. Neither nor B ^ K subleading parameters are updated 
significantly in any of the fits. $A_0^R$ has little effect compared to $A_0^L$ because the
observables depend on $A_0^{L,R} \propto C_9 \mp C_{10}$, and $C_9 \approx -C_{10}$.

In summary, we do not observe a drastic update of any nuisance parameter, showing 
that the fit is stable ^. The uncertainty on the form factors and some subleading corrections 
is reduced by the data, but the most likely values are shifted due to the tension between 
B ^ K and B ^ K* constraints. More theory input is required to reduce the uncertainty 
on the remaining subleading corrections. 

4.2 Predictions 

As outlined in Sec. 2.2.3, the angular distribution of $B \to K^*(\to K\pi)\ell^+\ell^-$ allows
one to form optimized observables, which have reduced form factor uncertainties and may exhibit
sensitivity to a particular type of new physics. Currently, no measurements of these observables
are available. We provide predictions at low and high $q^2$ within the scenario of the SM
operator basis, taking into account the present data. Consequently, future observations
outside the predicted ranges would indicate physics beyond the considered scenario.

The predictions of ^4^'^'^''^'^^ and h!^''^^ at low g^ are given in g^-integrated form for 
the bin $q^2 \in [1, 6]$ GeV$^2$ in Tab. 7. In addition, Fig. 7 shows the results of the 5
sub-bins with a bin width of 1 GeV$^2$, as used in the first measurement of the lepton
$A_{\rm FB}$ of
B K*li by LHCb [18]. The observables Ap^^ have been chosen due to their sensitivity 

^For the suppressed solutions, scale factors for B K* form factors and shift by 0(15%) and 
even peaks at the left boundary. 
Figure 7. Predictions of unmeasured optimized observables at large recoil based on the global fit
output. We show the most probable value [solid black line] as well as the smallest 68% (green) and
95% (yellow) intervals of the $q^2$-integrated observables.



q^-h'm 



[14.18, 16] 
[16,19.21] 
[14.18,19.21] 



n QQQfiq +0.00009 +0.00015 

u.yyyoy _o.oooii -0.00026 

n QQSQR +0.00025 +0.00044 

u.yyoyo _o.ooo32 -o.ooo76 

n 00779 +0.00058 +0.00105 
U.jy/ iZ, _o.00078 -0.00179 



n QS/1'? +0.0023 +0.0056 
-U. JO^zO _o.0022 -0.0039 

n 070/1 +0.0018 +0.0042 
-U.jru^ -0.0019 -0.0037 

n Q7QQ +0.0027 +0.0057 
-U.yrOO -0.0023 -0.0043 



n QS'?7 +0.0022 +0.0053 
-U.yOO^ -0.0019 -0.0033 

n QR1/1 +0.0015 +0.0037 
-U.yOl^ -0.0012 -0.0021 

n QfiflS +0.0019 +0.0045 
-U.yOUO _o.0015 -0.0027 



Table 8. Predictions of unmeasured, optimized observables based on global fit output for the two 
conventional bins and the entire low recoil region. We list the most probable value, the smallest 
68% and 95% intervals. 

to the chirality-flipped Cj [53]. The large discontinuity of in e [1, 3] GeV^ is caused 
by the zero crossing of J4 in its denominator (2.20). The observable Aj, is restricted by 
construction to take values in [-0.5, 0.5] and reaches its maximal value at the zero crossing 
of the lepton in the bin e [4, 5] GeV^ [56] . Its shape is sensitive to new physics 
contributions of the Wilson coefficients. Note that the theory uncertainty is at a minimum 

(5) 

when A)p approaches 0.5. 

The observable A^^^ reaches its maximal value of about 1.0 in e [2, 3] GeV^ and has 
the very same zero crossing as the leptonic A-p-Q. Our results are in qualitative agreement 
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with [60], who stressed that the deviation of the maximal value from 1.0 and its position 
are sensitive to new physics. The observables H^'^^ were first proposed for the high-g^ 
region [57] as long-distance free observables. In addition, H!j}^^ is also short-distance free, 
with \H!j}\q'^)\ - 1, depending only on the sign of a form factor. Recently it was shown 
that at low g^, form factors also cancel in Hlj}'"^^ [61]. Each has a zero crossing in the region 
e [1,3] GeV^ that is the very same as in the CP-averaged normalized observables J4,/T 
and Js/r [54, 55]. For one observes the rise towards 1.0 for rising q . 

At high $q^2$, the situation is more restrictive, and within the scenario of the SM operator
basis, there are only three optimized observables $H_T^{(1,2,3)}$ [57]. The predictions for
three $q^2$ bins are given in Tab. 8. Besides $|H_T^{(1)}(q^2)| = 1$, we have the additional
relation $H_T^{(2)}(q^2) = H_T^{(3)}(q^2)$. Small deviations in the predictions of
$\langle H_T^{(2,3)} \rangle$ arise from separate $q^2$-integration of $J_i$ (see (2.7) and below),
such that the equality does not hold exactly. Any large experimental deviation from the prediction
$|H_T^{(1)}(q^2)| = 1$ would signal a breakdown of the OPE [101]. The observables
$H_T^{(2,3)}(q^2)$ are given by the short-distance ratio [57]



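$$H_T^{(2,3)}(q^2) = \frac{2\,{\rm Re}\left[C_{79}^{\rm eff}(q^2)\, C_{10}^*\right]}{|C_{79}^{\rm eff}(q^2)|^2 + |C_{10}|^2} = \cos\left(\varphi_{79}(q^2) - \varphi_{10}\right) \frac{2r}{1 + r^2}, \qquad (4.5)$$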



with 



CfA^) - Cfii) ^ .^Cf (g^), rii) - (4.6) 

^ I ^10 I 

and $\mathcal{C}_{7,9}^{\rm eff}(q^2)$ and the factor $\kappa = 1 + O(\alpha_s)$ of the improved Isgur-Wise form factor relation
defined in [57]. In the SM, $\mathcal{C}_{10} \approx -4.2$ and therefore its phase $\varphi_{10} = \pi$. The $q^2$ dependence
of the sum of the effective Wilson coefficients $\mathcal{C}_{79}^{\rm eff}(q^2)$ is rather weak and its imaginary parts
at NLO in QCD are small [59], such that $\varphi_{79}(q^2) \approx 0$; the values of the Wilson
coefficients are $\mathcal{C}_9 \approx +4.2$ and $\mathcal{C}_7 \approx -0.3$, and lead to $r \approx 1$ and $\cos(\varphi_{79}(q^2) - \varphi_{10}) \approx -1$.
Therefore, $H_T^{(2,3)}$ test roughly the ratio of $|\mathcal{C}_9|/|\mathcal{C}_{10}|$ within our scenario of the SM operator
basis and real Wilson coefficients. The results in Tab. 8 show that current data do not
allow for deviations from the SM prediction. We remark that the prediction of $\langle H_T^{(1)}\rangle$ is
based on the OPE and is expected to be 1 at any particular value of $q^2$. Therefore, our
results just reflect how precisely the form factor and the modeled subleading corrections
cancel for the $q^2$-integrated version when taking into account the update of our knowledge
of the nuisance parameters due to the experimental information.
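
The two forms of the short-distance ratio in (4.5) can be checked numerically. The following is a minimal sketch, not the implementation used in the fit: the function names are hypothetical, and the input values for $\mathcal{C}_{79}^{\rm eff}$ and $\mathcal{C}_{10}$ are illustrative numbers of roughly SM size chosen here, not results of this work.

```python
import cmath
import math


def h_t23(c79_eff: complex, c10: complex) -> float:
    """Ratio 2 Re[C79 C10*] / (|C79|^2 + |C10|^2), first form of (4.5)."""
    numerator = 2.0 * (c79_eff * c10.conjugate()).real
    denominator = abs(c79_eff) ** 2 + abs(c10) ** 2
    return numerator / denominator


def h_t23_polar(c79_eff: complex, c10: complex) -> float:
    """Same quantity written as cos(phi79 - phi10) * 2r / (1 + r^2)."""
    r = abs(c79_eff) / abs(c10)
    delta_phi = cmath.phase(c79_eff) - cmath.phase(c10)
    return math.cos(delta_phi) * 2.0 * r / (1.0 + r * r)


# Illustrative inputs (assumptions): a nearly real C79 and a negative real C10.
C79_EXAMPLE = complex(3.4, 0.1)
C10_EXAMPLE = complex(-4.2, 0.0)

value = h_t23(C79_EXAMPLE, C10_EXAMPLE)
value_polar = h_t23_polar(C79_EXAMPLE, C10_EXAMPLE)
```

With a relative phase close to $\pi$ and $r$ close to 1, both functions return a value slightly above $-1$, which is the behavior described in the text.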

Figure 8. Probability distributions of the SM predictions of $q^2$-integrated observables in the
$B \to K^*\ell\ell$ decay, when varying nuisance parameters within their allowed prior ranges [solid, blue].
The shaded region is the 68% interval and the vertical (red) line indicates the prediction when
using central values of nuisance parameters. Also shown are the predictions based on posterior
distributions [dashed, green] determined by the experimental data, allowing also for NP in $\mathcal{C}_{7,9,10}$
in the fit.

Although SM predictions have been given previously [54, 64, 66], our Monte Carlo
approach described in Sec. 3.4 provides several improvements with respect to the standard
procedure to estimate theory uncertainties. Usually observables $X(\nu)$ are computed at
three values of a single parameter $\nu$: at the central value $\nu_{\rm cen}$ and at $\nu_{\rm cen} \pm \sigma$. The changes
in the predictions of $X$ are then interpreted as the associated uncertainty,
$\sigma_{+,-} = |X(\nu_{\rm cen}) - X(\nu_{\rm cen} \pm \sigma)|$, and the central value of $X$ is assumed to be $X(\nu_{\rm cen})$. In the presence of several
parameters, the respective uncertainties are then combined either linearly or in quadrature
into a total uncertainty. In contrast to this so-called min-max approach, we vary all
parameters at the same time and thus automatically take correlations into account. Our
intervals have a strict probabilistic interpretation as Bayesian credibility intervals, and the
procedure automatically takes care of non-linearities. As a simple example consider the
quadratic dependence of a branching ratio $\mathcal{B}$ on a decay constant or form factor
$f$. Assuming a Gaussian prior distribution of $f$, $p(\mathcal{B})$ is the (asymmetric) $\chi^2$-distribution
with one degree of freedom. Typical examples of asymmetry can be seen in Fig. 8 for
$\langle\mathcal{B}\rangle[1,6]$ and $\langle F_L\rangle[1,6]$ (blue, solid) of the decay $B \to K^*\ell\ell$, where the maximum of the
distribution deviates from the vertical (red) line that indicates the prediction obtained by
using central values for all nuisance parameters; i.e., the position of the maxima of their
priors. This behavior is not present in $\langle A_T^{(2)}\rangle[1, 6]$ since there, form factors cancel; likewise
in $\langle H_T^{(1)}\rangle[14.18, 16]$. We list the modes and 68% intervals for a number of observables in
App. C in Tab. 13 and Tab. 14, but stress that the uncertainty of an observable $X$ is best
described by the probability distribution $p(X)$. In the simplest case, $p(X)$ can be described
by the 68% interval and the mode, but in general, it contains much more information as
demonstrated in Fig. 8.
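
The difference between the min-max estimate and a sampling-based estimate can be illustrated with the quadratic toy example from the text. The sketch below is an illustration only: the toy observable $\mathcal{B} = f^2$, the numbers $f = 1.0 \pm 0.15$, the function names, and the use of a central (rather than smallest) 68% interval are assumptions made here for brevity.

```python
import random
import statistics


def branching_ratio(f: float) -> float:
    """Toy observable with quadratic dependence on a form factor f."""
    return f * f


def min_max(f_cen: float, sigma: float):
    """Min-max estimate: evaluate at the central value and at f_cen +- sigma."""
    central = branching_ratio(f_cen)
    up = abs(branching_ratio(f_cen + sigma) - central)
    down = abs(central - branching_ratio(f_cen - sigma))
    return central, up, down


def monte_carlo(f_cen: float, sigma: float, n: int = 200_000, seed: int = 1):
    """Sample f from a Gaussian prior and summarize the distribution of B."""
    rng = random.Random(seed)
    samples = sorted(branching_ratio(rng.gauss(f_cen, sigma)) for _ in range(n))
    lower = samples[int(0.16 * n)]   # 16% quantile
    upper = samples[int(0.84 * n)]   # 84% quantile
    return statistics.fmean(samples), lower, upper


central, up, down = min_max(1.0, 0.15)
mean, lower, upper = monte_carlo(1.0, 0.15)
```

In this toy case the sampled mean is shifted by $\sigma^2$ relative to the value at the central parameter point, and the interval is asymmetric, which a single evaluation at the central value cannot show.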

Let us finally compare the SM predictions of observables based on the prior information
with predictions based on posterior distributions as determined by experimental data and
allowing also for NP in $\mathcal{C}_{7,9,10}$. Our posterior findings are overlaid on the SM predictions
for the examples in Fig. 8. Although NP contributions to the Wilson coefficients are
included, in all cases the posterior distributions are narrower than the SM prediction based
on prior knowledge only. Obviously, the additional information from data on the nuisance
parameters updates our knowledge on quantities $\langle\mathcal{B}\rangle[1,6]$ and $\langle F_L\rangle[1,6]$, which served as
inputs to the fit.

As described in the previous section, both fit solutions for $\mathcal{C}_{7,9,10}$ give an overall good
description of the experimental data. For the optimized observable $\langle A_T^{(2)}\rangle[1,6]$, they yield
a prediction of similar range, shifted slightly towards larger values, compared to the SM
prediction based on prior knowledge alone. The same situation emerges for the other
optimized observables, which are free of form factor uncertainties (compare Tab. 7 and
13 for low $q^2$ as well as Tab. 8 and 14 for high $q^2$), and whose main uncertainties are due
to lacking subleading corrections. At this stage, better prior knowledge on the nuisance
parameters is needed. This will help to distinguish new physics from the SM with the
help of optimized observables, in the scenario of the SM operator basis with real Wilson
coefficients. However, any experimental observation outside of the predicted range would
point strongly to an extended scenario.

5 Conclusion 

We perform a fit of the short-distance couplings $\mathcal{C}_{7,9,10}$ appearing in the effective theory of
$\Delta B = 1$ decays describing $b \to s\gamma$ and $b \to s\ell\ell$ transitions, assuming $\mathcal{C}_{7,9,10}$ to be real valued.
For the first time, we include all relevant theory uncertainties in the analysis by means
of nuisance parameters. Measurements of exclusive rare decays $B \to K^*\gamma$, $B \to K^{(*)}\ell\ell$
and $B_s \to \mu\mu$ obtained by CLEO, BaBar, Belle, CDF, and LHCb serve as experimental
inputs. Besides presenting credibility intervals for the Wilson coefficients, we analyze the 
goodness of fit of the obtained solutions. For the best-fit solution, we show the pull values 
for all measurements in Figures 2 and 3. We use a novel combination of Markov Chain 
Monte Carlo and adaptive importance sampling methods in order to cope with the high 
dimensionality of the parameter space (~ 30) and the multimodal posterior distribution. 
With this approach, we can massively parallelize the costly evaluation of the posterior. 
Our results should simplify subsequent model-dependent studies; we are happy to provide 
the fit output in a suitable format upon request. 

The credibility intervals of the marginalized one- and two-dimensional posterior distributions
of $\mathcal{C}_{7,9,10}$ are the main results of our fit, given in Tab. 5 and Fig. 1. Due to a
discrete symmetry, a SM-like and a flipped-sign solution remain with posterior mass ratio
of roughly 51% over 49%. Other local maxima exist, but their posterior masses are negligible.
The SM values $\mathcal{C}^{\rm SM}_{7,9,10}$ are close to the best-fit point. Both solutions as well as the
SM itself provide a good fit of the data. Judging by the Bayes factor as model comparison
criterion, the data clearly favor the plain SM over a model with arbitrary real $\mathcal{C}_{7,9,10}$, a
tribute to Occam's razor. Thus, from a purely statistical point of view, even the simplest
model-independent extension of the SM is not necessary to describe the current data. We
emphasize that the presence of the sign-flipped solution still allows large NP contributions
to the Wilson coefficients. However, the degeneracy of the observables does not allow us
to distinguish them easily. This degeneracy is mildly broken by contributions of 4-quark
operators, typically included in the effective Wilson coefficients $\mathcal{C}_{7,9}^{\rm eff}$. Assuming
improved theory uncertainties and current experimental central values in B ^ the fit

suggests that the additional information on Cy^ enhances the SM-like solution over the 
flipped-sign solution. We expect a reduced theory uncertainty when including B Xg^. 

We provide updated predictions within the SM of selected observables in the angular
distribution of $B \to K^*(\to K\pi)\ell\ell$. Based on prior knowledge only, we obtain reduced
theory uncertainties due to the improved handling of uncertainty propagation, observing that
the central values of previous analyses [57, 59, 66] are contained in the smallest 68% regions.

Based on the fit output, we predict ranges for currently unmeasured observables that 
exhibit a reduced form factor dependence. Surprisingly, the predictions based on the fit 
output yield smaller ranges than SM predictions based on prior knowledge. The extra vari- 
ance due to Wilson coefficients is more than compensated for by the reduced uncertainties 
as the fit constrains some of the nuisance parameters and yields the correlation between 
all parameters. 

We observe that a fit with current $B \to K\ell\ell$ constraints prefers smaller values of
$\mathcal{C}_9$ than a fit with the $B \to K^*\ell\ell$ constraints. Including both sets of constraints, the fit
accommodates this tension by shifting the $B \to K$ form factors towards smaller values, and
the $B \to K^*$ form factors towards larger values.

Future analyses can improve the fit by including results for the inclusive decays
$B \to X_s\gamma$ as well as $B \to X_s\ell\ell$. Besides the inclusion of additional observables, further
enhancements could arise when using an alternative parametrization of $B \to K^*$ form factors; cf.
[114].

A Numerical Input

The numerical input values of parameters are listed in Tab. 9, for which the uncertain- 
ties have not been included since they are either very small or they enter in numerically 
subleading contributions to the observables of interest. 

The theory predictions of all the relevant semileptonic and radiative processes at large 
recoil are based on the QCDF results [38, 40]. These include the usage of the Light Cone 
Distribution Amplitudes (LCDA) of the involved kaons which are parametrized in terms
of Gegenbauer moments $a_n(M)$ ($M = K, K^*_\perp, K^*_\parallel$). In this work, we include terms in the
expansion in Gegenbauer moments up to $n = 2$, using the central values in Tab. 9.

Parameter                      Value          Ref.
$\alpha_s(M_Z)$                0.11762
$m_\mu$                        0.106 GeV
$\alpha_e(m_b)$                1/133
$m_t^{\rm pole}$               173.3 GeV      [102]
$\sin^2\theta_W$               0.23116        [91]
$M_W$                          80.399 GeV     [91]
$\tau_{B^+}$                   1.638 ps       [91]
$\tau_{B^0}$                   1.525 ps       [91]
$M_{B^+}$                      5.2792 GeV     [91]
$M_{B^0}$                      5.2795 GeV     [91]
$M_{K^+}$                      0.494 GeV      [91]
$M_{K^0}$                      0.498 GeV      [91]
$M_{K^{*+}}$                   0.892 GeV      [91]
$M_{K^{*0}}$                   0.896 GeV      [91]
$\tau_{B_s}$                   1.472 ps       [91]
$M_{B_s}$                      5.3663 GeV     [91]
$\lambda_{B,+}$                0.485 GeV
$f_B$                          0.212 GeV      [103]
$f_K$                          0.1561 GeV
$f^\perp_{K^*}$(2 GeV)         0.173 GeV
$f^\parallel_{K^*}$            0.217 GeV
$a_1(K)$                       0.048
$a_2(K)$                       0.174
$a_1(K^*_\perp)$               0.1
$a_2(K^*_\perp)$               0.1
$a_1(K^*_\parallel)$           0.1
$a_2(K^*_\parallel)$           0.1

Table 9. The numerical input used in the analysis. The mass of the strange quark has been
neglected throughout. $\tau_{B^0}$ ($\tau_{B^+}$) denotes the lifetime of the neutral (charged) $B$ meson. The
following parameters appear in expressions of $B \to (K, K^*)\ell\ell$ at large recoil: $\lambda_{B,+}$ denotes the
first inverse moment of the $B$-meson distribution amplitude, whereas $f_M$ are the decay constants and
$a_{1,2}(M)$ are the first two Gegenbauer moments of the LCDAs of the respective kaon states $M =
K, K^*_\perp, K^*_\parallel$.

Since the $a_n(M)$ also enter the computation of the $B \to K^*$ form factors via LC sum
rules [107], variation of the former would lead to double counting. Furthermore, the residual
influence of the $a_n(M)$ on the observables is small compared to that of other parameters.
We therefore do not vary the Gegenbauer moments.

In addition, QCDF makes use of the decay constants $f_M$ ($M = K, K^*_\perp, K^*_\parallel$), which
enter in numerically subleading contributions. The central values are listed in Tab. 9.

B Nuisance Parameters 

In this section we present the nuisance parameters that are considered in this work and
contribute the main uncertainties in theory predictions. All the priors of these parameters
are clipped to the parameter range that corresponds to their respective $3\sigma$ interval. For
the sake of readability, we categorize the individual nuisance parameters.

B.1 Common Nuisance Parameters

The common nuisance parameters are those that enter most of the observables and are not
specific for rare $b \to s$ decays. These are the elements of the Cabibbo-Kobayashi-Maskawa
(CKM) quark-mixing matrix and the $b$ and $c$ quark masses.

Parameter        Value                               Ref.
$A$              0.804 ± 0.010                       [92]
$\lambda$        0.22535 ± 0.00065                   [92]
$\bar\rho$       0.111 ± 0.070                       [92]
$\bar\eta$       0.381 ± 0.030                       [92]
$m_c(m_c)$       $(1.27^{+0.07}_{-0.09})$ GeV        [91]
$m_b(m_b)$       $(4.19^{+0.18}_{-0.06})$ GeV        [91]

Table 10. Common nuisance parameters: The CKM Wolfenstein parameter values as obtained
from the CKM tree-level fit, cf. Sec. 3.2.

For the purpose of the fit of rare $b \to s$ decays, we take the CKM matrix elements from
other observables such as tree decays. We parametrize the CKM matrix elements using the
Wolfenstein parametrization to 0{X^) [108] and use the results of the tree-level fit of the 
UTfit collaboration [92] as priors in the fit of $b \to s$ decays. In this way, we include non-SM
effects, but assume they do not affect tree-level decays. However, we use the results of the
SM CKM fit in order to determine the uncertainties of observables in the framework of the
SM in Sec. 4.2. Note that the CKM matrix elements only enter the branching ratios of
$B \to K^{(*)} + (\gamma, \ell\ell)$ decays in the combination $V_{tb}V_{ts}^*$. Although numerically negligible, the
combination $V_{ub}V_{us}^*$ entering all observables is included in the analysis. It becomes relevant
only for CP-asymmetric observables. All priors are Gaussian, with their $1\sigma$ ranges given
in Tab. 10.

The values of the quark masses $m_b$ and $m_c$ enter most observables. In order to account
for the asymmetric errors, we use LogGamma distributions (see Sec. D.2) as priors whose
modes and 68%-probability intervals match the values given in Tab. 10.

B.2 $B \to K^{(*)}$ Form Factors and $f_{B_s}$

The heavy-to-light form factors $f_{+,T,0}$ for $B \to K$ and $V$, $A_{0,1,2}$, and $T_{1,2,3}$ for $B \to K^*$
transitions present a major source of uncertainty in predictions of rare exclusive $B$ decays.
They are functions of the dilepton invariant mass $q^2$ and we adopt the definition used in
[40, 41, 48, 107]. Due to the application of form factor relations at large and low recoil,
only $f_+$ enters $B \to K$ and $V$ and $A_{1,2}$ enter $B \to K^*$ transitions^4. The application of
form factor relations introduces uncertainties of order $\Lambda_{\rm QCD}/m_b$ that will be discussed in
App. B.3.

Currently, the form factors are only known from Light Cone Sum Rules (LCSR) which
are applicable at low $q^2$. Lattice QCD can provide results at high $q^2$, where quenched results
for some form factors [109, 110] are available and some preliminary unquenched results have
been reported in [111-113]. An extensive discussion of the $q^2$-shape parametrization using
series expansion and a fit to low-$q^2$ LCSR combined with high-$q^2$ lattice results (when
available) can be found in [114].

With regard to $B \to K^*$ form factors $V, A_{1,2}$, we use the LCSR results at low $q^2$ as
given in [107], where the extrapolation to high $q^2$ is based on a (multi-)pole ansatz

$$ V(q^2) = \frac{r_1}{1 - q^2/m_R^2} + \frac{r_2}{1 - q^2/m_{\rm fit}^2}, \qquad (B.1) $$

$$ A_1(q^2) = \frac{r_2}{1 - q^2/m_{\rm fit}^2}, \qquad A_2(q^2) = \frac{r_1}{1 - q^2/m_{\rm fit}^2} + \frac{r_2}{(1 - q^2/m_{\rm fit}^2)^2}, $$

^4 The form factors $f_0$ and $A_0$ do not contribute within the framework of the SM operator basis, up to
negligible terms suppressed by $m_\ell^2/q^2$.

        $r_1$      $r_2$      $m_R^2$ [GeV$^2$]    $m_{\rm fit}^2$ [GeV$^2$]
$V$     0.923      -0.511     $5.32^2$             49.40
$A_1$              0.290                           40.38
$A_2$   -0.084     0.343                           52.00

Table 11. The parameters of the form factors $V$ and $A_{1,2}$.

and the numerical values of the parameters given in Tab. 11. We do not vary these parameters
themselves as they strongly depend on the LCSR analysis, but rather assign one
multiplicative scaling factor $\zeta_i$ per form factor ($i = V, A_1, A_2$) to model the respective
uncertainty such that the value $\zeta_i = 1.0$ corresponds to the central value of the form factor.
A Gaussian prior is assigned to these nuisance parameters, which has a width of $\sigma = 0.15$
(i.e., 15% uncertainty) and its support extends up to $3\sigma$ (i.e., maximally 45% uncertainty),
outside of which the prior is set to zero (see Tab. 12). Note that in this way we do not vary
the shape of the form factors. At large recoil, two universal form factors [40] appear

$$ \xi_\perp = \frac{M_B}{M_B + M_{K^*}}\,V, \qquad \xi_\parallel = \frac{M_B + M_{K^*}}{2E_{K^*}}\,A_1 - \frac{M_B - M_{K^*}}{M_B}\,A_2, $$

and their variation is obtained by the uncorrelated variation of $V$ and $A_{1,2}$ as described
above.
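
As an illustration of the pole ansatz with the numbers of Tab. 11, the following minimal sketch evaluates $V$, $A_{1,2}$ and the universal form factors with the scaling factors applied. The function names, the argument `zeta`, and the choice of the neutral-meson masses from Tab. 9 are assumptions made for this example; the kaon energy $E_{K^*} = (M_B^2 + M_{K^*}^2 - q^2)/(2M_B)$ is the standard two-body kinematics in the $B$ rest frame.

```python
M_B = 5.2795       # GeV, neutral B meson mass (Tab. 9)
M_KSTAR = 0.896    # GeV, neutral K* mass (Tab. 9)

# Fit parameters of Tab. 11; masses squared are in GeV^2.
M_R2_V, M_FIT2_V = 5.32 ** 2, 49.40
M_FIT2_A1 = 40.38
M_FIT2_A2 = 52.00


def ff_v(q2: float, zeta: float = 1.0) -> float:
    """Vector form factor V(q^2), scaled by the nuisance factor zeta."""
    return zeta * (0.923 / (1.0 - q2 / M_R2_V) - 0.511 / (1.0 - q2 / M_FIT2_V))


def ff_a1(q2: float, zeta: float = 1.0) -> float:
    """Axial-vector form factor A_1(q^2): single pole."""
    return zeta * 0.290 / (1.0 - q2 / M_FIT2_A1)


def ff_a2(q2: float, zeta: float = 1.0) -> float:
    """Axial-vector form factor A_2(q^2): pole plus dipole."""
    x = 1.0 - q2 / M_FIT2_A2
    return zeta * (-0.084 / x + 0.343 / (x * x))


def xi_perp(q2: float, zeta_v: float = 1.0) -> float:
    """Universal form factor xi_perp at large recoil."""
    return M_B / (M_B + M_KSTAR) * ff_v(q2, zeta_v)


def xi_par(q2: float, zeta_a1: float = 1.0, zeta_a2: float = 1.0) -> float:
    """Universal form factor xi_parallel at large recoil."""
    energy = (M_B ** 2 + M_KSTAR ** 2 - q2) / (2.0 * M_B)
    return ((M_B + M_KSTAR) / (2.0 * energy) * ff_a1(q2, zeta_a1)
            - (M_B - M_KSTAR) / M_B * ff_a2(q2, zeta_a2))
```

Because each scaling factor multiplies the whole form factor, varying it changes the normalization but not the $q^2$ shape, as stated in the text.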

Since we calculate the $B \to K^*\gamma$ matrix element within QCDF for $q^2 = 0$, all nuisance
parameters that affect the process $B \to K^*\ell\ell$ in the large recoil region likewise affect the
radiative process, as far as they are applicable.

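With regard to the $B \to K$ form factor $f_+$, we use the BCL parametrization [115] of
the LCSR results [41]

$$ f_+(q^2) = \frac{f_+(0)}{1 - q^2/M_{{\rm res},+}^2} \left[ 1 + b_1^+ \left( z(q^2) - z(0) + \frac{1}{2}\left[ z(q^2)^2 - z(0)^2 \right] \right) \right], \qquad (B.3) $$

$$ z(s) = \frac{\sqrt{\tau_+ - s} - \sqrt{\tau_+ - \tau_0}}{\sqrt{\tau_+ - s} + \sqrt{\tau_+ - \tau_0}}, \qquad \tau_0 = \sqrt{\tau_+}\left(\sqrt{\tau_+} - \sqrt{\tau_+ - \tau_-}\right), \qquad \tau_\pm = (M_B \pm M_K)^2. $$

This parametrization depends on the central value of the form factor at $q^2 = 0$, $f_+(0)$, and
the slope parameter $b_1^+$ (and $M_{{\rm res},+} = 5.412$ GeV). At large recoil, the dipole form factor
$f_T$ is replaced by the large-energy universal form factor $\xi_P = f_+$ [48, 63]. At low recoil, the
dipole form factor $f_T$ is substituted for by means of the improved Isgur-Wise relation [59].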

In addition, we vary the decay constant $f_{B_s}$ of the $B_s$ meson, since it constitutes the
dominant uncertainty in the decay $B_s \to \mu\mu$. The most recent lattice results [104, 105]
have been averaged [106], yielding the number listed in Tab. 12.

In order to assess the dependence of the fit on the choice of priors, we adopt two sets of
priors. The first set reflects the uncertainties as reported by the authors of [41, 106, 107],
thereby assuming the extrapolation of form factors to high $q^2$ has the same uncertainties
as predicted by LCSRs at low $q^2$. In the second set we triple the uncertainties. Both sets
are given in Tab. 12.

parameter                                      central      nominal                           wide
                                                            $1\sigma$       support           $1\sigma$       support
$\zeta_{V,A_1,A_2}$                            1.0          0.15            $3\sigma$         0.45            $3\sigma$
$f_+(0)$                                       0.34         [0.32, 0.39]    [0.28, 0.49]      [0.28, 0.49]    [0.0, 0.79]
$b_1^+$                                        -2.1         [-3.7, -1.2]    [-6.9, 0.6]       [-6.9, 0.6]     [-10, 3.7]
$f_{B_s}$                                      227.7 MeV    6.2 MeV         $3\sigma$         18.6 MeV        $3\sigma$
$\zeta^{L,R}_{0,\perp,\parallel}$, $\zeta_K$   1.0          0.15            $3\sigma$         0.45            [0.0, 2.0]
$|r_{0,\perp,\parallel}|$, $|r_K|$             0.0          0.15            $3\sigma$         0.45            $3\sigma$

Table 12. Priors of the nuisance parameters of the $B \to K^{(*)}$ form factors, the $B_s$ decay constant
$f_{B_s}$, and parametrization of lacking subleading corrections at low ($i = L,R$ and $j = 0,\perp,\parallel$) and
high $q^2$, specified for the nominal and wide set. All priors are Gaussian and we give the central
value, its standard deviation $\sigma$, and the support of the prior. The nominal $1\sigma$ ranges of $V$ and
$A_{1,2}$ correspond to uncertainties quoted in [107], whereas $f_+(0)$ and $b_1^+$ are taken from the LCSR
analysis [41]; however, possible correlations among both are not available.

B.3 Subleading $\Lambda/m_b$ Corrections

There are several distinct sources of $\Lambda/m_b$ corrections arising in exclusive $B \to K^{(*)}\ell\ell$
decays. Here $\Lambda$ is assumed to be of the order of the strong scale; however, the particular
physical meaning depends on the framework. When using power counting we use the
generic value of 500 MeV.

The first type is due to the form factor relations in the limit of heavy quark masses
[46], which is valid for the whole $q^2$-kinematic region. At the leading order in $\Lambda/m_b$, they
relate the $B \to K^*$ ($B \to K$) tensor form factors $T_{1,2,3}$ ($f_T$) to vector $V$ ($f_+$) and
axial-vector $A_{1,2}$ form factors^5. This approximation receives a further numerical suppression
due to $\mathcal{C}_7/\mathcal{C}_9 \sim O(0.1)$. The additional large energy limit [47, 48] at low $q^2$ allows us to
eliminate another $B \to K^*$ form factor, introducing an additional subleading uncertainty
not suppressed by $\mathcal{C}_7/\mathcal{C}_9$. Besides subleading corrections due to the use of form factor
relations, the two distinct expansions in $\Lambda/m_b$, QCDF at low and the OPE at high $q^2$,
introduce a second type at the amplitude level, when truncating the expansion after the
leading order in $\Lambda/m_b$.

At low $q^2$, QCDF (or equivalently SCET) provides a possibility to calculate such
corrections, which are in general suppressed by a factor of $\Lambda/m_b$^6. In principle, the partially
known corrections [40, 81] could be included as an estimate of the lacking corrections, but
here we model them by 6 real scale factors, one for each of the transversity amplitudes $A^{L,R}_{0,\perp,\parallel}$
in the case of $B \to K^*\ell\ell$, and one for $B \to K\ell\ell$. These scale factors $\zeta^i_j$ ($i = L,R$ and
$j = 0,\perp,\parallel$) and $\zeta_K$ have Gaussian prior distributions each with central value 1 and a $1\sigma$
range of $0.15 \sim \Lambda/m_b$ with a support up to $3\sigma$ and include the subleading corrections due to
form factor relations discussed above. A $1\sigma$ range of $0.45 \sim \Lambda/m_b$ with a support [0.0, 2.0]
is chosen for the wide-prior scenario.

^5 The authors of [54] take the viewpoint that such corrections can be accounted for at low $q^2$, if form factor
relations are not used in the leading-order contribution (in $\Lambda/m_b$ and $\alpha_s$) to the amplitude.
^6 In some subleading corrections one encounters infrared divergences [81, 116].

At high $q^2$, the interaction of the 4-quark operators and the electromagnetic current,
which couples to the pair of leptons, might be treated within a local operator product
expansion either in full QCD [45] or with subsequent matching on HQET [44]. In both
approaches, subleading corrections to the decay amplitudes arise at $(\Lambda/m_b)^2$ and $\alpha_s\Lambda/m_b$,
respectively, which are of similar numerical size. The additional suppression factor of $\Lambda/m_b$
or $\alpha_s$ yields smaller theory uncertainties due to omission of subleading corrections at high
$q^2$ in contrast to the low-$q^2$ region. This is also not spoiled by the use of form factor
relations [44, 57] for tensor form factors $T_{1,2,3}$ ($f_T$) due to the accompanying numerical
suppression by $\mathcal{C}_7/\mathcal{C}_9$, which depends on the new physics contributions. Note that for both
approaches, full QCD and HQET, the subleading corrections are known in part, and in the
future it is conceivable that they can be included completely. For example, the unknown
subleading form factor arising in [45] could be calculated on the lattice. We follow [44],
using $\alpha_s(m_b) \sim 0.3$. This gives rise to 3 complex $r_a \sim \Lambda/m_b$ ($a = 0,\perp,\parallel$) for $B \to K^*\ell\ell$
[59] and one complex $r_K \sim \Lambda/m_b$ for $B \to K\ell\ell$ [64], which are additive at the level of
the amplitude. We treat the complex-valued subleading nuisance parameters with eight
additional real-valued degrees of freedom, using Gaussian priors each with central value 0,
a $1\sigma$ range of $0.15 \sim \Lambda/m_b$, and a support up to $3\sigma$ for its magnitude. The accompanying
phases have uniform priors in $[-\pi/2, \pi/2]$. A three-times-wider $1\sigma$ range of $0.45 \sim \Lambda/m_b$
and a support up to $3\sigma$ is chosen for the wide-prior scenario.

The choices are also listed in Tab. 12. 
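
Drawing from the priors of the high-$q^2$ subleading parameters described above can be sketched as follows. This is an illustration under stated assumptions, not the sampler used in the fit: the function names are hypothetical, the Gaussian is truncated by simple rejection, and the magnitude parameter is allowed to take either sign within its $3\sigma$ support.

```python
import cmath
import math
import random


def truncated_gauss(rng: random.Random, mu: float, sigma: float,
                    n_sigma: float = 3.0) -> float:
    """Draw from a Gaussian whose support is clipped to mu +- n_sigma * sigma."""
    while True:
        x = rng.gauss(mu, sigma)
        if abs(x - mu) <= n_sigma * sigma:
            return x


def sample_subleading(rng: random.Random, sigma: float = 0.15) -> complex:
    """Draw one complex subleading parameter r = m * exp(i * phi)."""
    magnitude = truncated_gauss(rng, 0.0, sigma)        # central value 0
    phase = rng.uniform(-math.pi / 2.0, math.pi / 2.0)  # uniform phase prior
    return magnitude * cmath.exp(1j * phase)


rng = random.Random(0)
nominal = [sample_subleading(rng, 0.15) for _ in range(10_000)]
wide = [sample_subleading(rng, 0.45) for _ in range(10_000)]
```

The nominal draws are confined to $|r| \le 0.45$ and the wide draws to $|r| \le 1.35$, matching the $3\sigma$ supports quoted in the text.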

C Standard Model Predictions 

In this appendix we provide $q^2$-integrated SM predictions for measured and unmeasured
observables, focusing on those low- and high-$q^2$ bins that are currently used in experimental
analysis and are also accessible to theoretical methods. All quantities are CP averaged and
lepton-mass effects have been taken into account using $\ell = \mu$. The theory uncertainties
have been obtained using the (nominal) prior distributions of the nuisance parameters.
The results are listed in Tab. 13 and Tab. 14 for low and high $q^2$. The central value
corresponds to the mode and the errors to the smallest 68% interval of the probability
distribution obtained with the Monte Carlo method. The value in parentheses is obtained
when setting all nuisance parameters to the most probable prior value.

At low g^, we do not predict Js^g and associated optimized observables Ai^'^^\ since 
they vanish at leading order in QCDF (including the as corrections), although we obtain 
non-vanishing values due to the implementation of subleading terms of kinematic origin 
(~ Mk*IMb). 
Observable


[2.0, 4.3] 


[1.0, 6.0] 


{Bk) X 10^ t 


0.85 ^O;! (0.81) 


1 Qc +0.54 

i.OO _g 28 


(1.75) 


{Bk*) X 10^ ^ 

1 T? \ 

{Fl} 


0.69 (1.05) 
0.055 ![J:[]i (0.086) 

n cr; +0.08 /n 'yoN 

0.85 _o 20 (0-"8) 


1 64 +1-80 
J^-"^ -0.83 

n 03 +0 07 
"•u-j -0.02 

n 01 +0.09 

U.»i _Q 22 


(2.46) 
(0.05) 
(0.73) 


( Jls) X 10^ 


1.18 (1.26) 


q AO +1.37 
-0.95 


(3.66) 


( Jlc) X 10^ 


0.31 (0.63) 


0-83 ^o:76 


(1.37) 


( J2,) X 10^ 


0.39 ![J:}^ (0.42) 


1 1 Q +0.45 
J^'-'-'J -0.31 


(1.21) 


( J2c) X 10^ 


-0.30 (-0.61) 


-0.79 lUl 


(-1.33) 


( J4) X 10^ 


/-I ^- -1-0 "-tQ / l—rl—r\ 

^■57 o'ii (0-77) 


1 n CO 
1 4^ +0.82 
'-■^'^ -0.62 


(1.82) 


W5/ X iU 


_n RQ +0.37 r|7\ 


1 r^/-» +0 88 

-1.80 


(-2.58) 


X 10^ 


84 fO 90") 


1 1 n +0.87 
1-19-0.74 


[i.Zi ) 


( J7) X 10^ 


2.52 (2.78) 

— i.Uu V / 


5 86 +o2o 


(6.21) 


( Js) X 10^ 


-0.89 tifj (-0.97) 


-1.79 

— -L .OU 


(-2.14) 




0.45 +012 (0.50) 


'^•^^ -0.08 


(0.47) 




0.63 (0.69) 


0.64 Til 


(0.71) 




0.41 (0.42) 


0.48 


(0.48) 




0.61 (0.54) 


0.29 Ti 


(0.25) 


(i/«) 


0.45 Tol (0.48) 


0.42 


(0.45) 


(<) 


-0.29 ^O;"! (-0.34) 


-0.29 ^0;°^ 


(-0.33) 



Table 13. SM predictions of $q^2$-integrated observables at low $q^2$ in the bins $q^2 \in [q^2_{\min}, q^2_{\max}]$
for $B^+ \to K^+\mu\mu$ and $B^0 \to K^{*0}\mu\mu$. We list the mode and the smallest 68% interval of the probability
distribution, along with the value obtained by the conventional method of setting all nuisance
parameters to the prior modes (in parentheses).

At high $q^2$, $J_{7,8,9}$ is zero at leading order in the OPE and when applying form factor
relations, so is $A_T^{({\rm im})}$. Furthermore, we recall that $F_L$ and $A_T^{(2,3)}$ become short-distance
independent [57] within the framework of the SM operator basis, and predictions are strongly
dependent on the extrapolation of the form factor results from low $q^2$ obtained using LCSR.

We do not predict $J_{6c}$ since it vanishes in the absence of scalar and tensor operators.

D Distributions 

D.l Amoroso Distribution 

Consider the posterior $P(x|D)$, describing the search for a decay whose existence has not
been established yet, with $x$ representing the branching ratio. Suppose we know $P(x|D)$
Observable


[14.18, 16.0] 


[> 16.0] 


[> 14.18] 


{Bk) X 10^ ^ 


0.39 (0.37) 

— u.uy V / 


0.73 T9 (0.68) 


1.11 Tt (1.04) 


{Fl) 


1 19 Cl 26") 

^•-"^^ -0.31 V^-^"^ 

-0.44 Tt (-0.44) 
0.38!o;o4 (0.36) 


1 41 fl 46") 

^.^^ -0.38 V-"^.^"^ 

-0.37 ![J [J? (-0.38) 
0.35!o;oi (0.34) 


2 57 +S §2 (2 72) 

-0.68 V^.'^/' 

-0.40 Tot (-0.41) 
0.36!o;ot (0.35) 


{Jls)x 10^ 


4.44 To (4-51) 


5.10!}:tf (5.44) 


9.70 (9.96) 


{Jlc) X 10^ 


3.23 (3.43) 


3.40 (3.72) 


6.64 i:I|(7.14) 


{J2s)x 10« 


1.48 (1.50) 


1.70 !°:tf (1.81) 


3.23 Tl (3-31) 


( J2c) X 10*^ 


-3.21 !}f (-3.41) 


-3.38 (-3.70) 


-6.61 (-7.11) 


( J3) X 10« 


-0.99 Tl (-1-11) 


-2.12 (-2.19) 


-3.06!}:!^ (-3.29) 


( J4) X 10« 


2.47 (2.65) 


3.10!J:0« (3.27) 


5.49 Tl (5.92) 


( J5) X 10^ 


-3.36 (-3.54) 


-2.95 To (-3.17) 


-6.23!}:f| (-6.72) 


{Jes)x 10^ 


-0.52 !0:}0(-0.55) 


-0.53 TA (-0.56) 


-1.05 ![J:i (-1.11) 




-0.38 ![l:}I(-0.37) 


-0.64 Tio (-0.60) 


-0.51 (-0.50) 


(4') 


1.45 (1.47) 


1.95 !Oio (2.01) 


1.67 (1.72) 




0.66 (0.67) 


0.48 Tio (0.48) 


0.56!0:}2 (0.57) 




085 CO 081 


111 fo log") 

u.iii -0.014 V".^"^^ 


123 CO ^2n) 

u.i^o_o.oi2 V".^^'-'/' 




-0.982 ^O;™ (-0.915) 


-0.777 !0:0|9 (-0.767) 


-0.843 TI7 (-0.834) 




0-9996 Tool (0-9996) 


0.9986 Tom (0.9986) 


0.9970 Toil (0.9969) 




-0.9844 Toll (-0.9853) 


-0.9719 Toll (-0.9722) 


-0.9748 Toil (-0.9751) 




-0.9837 !|]:[JJ]?| (-0.9845) 


-0.9614 ![j:[J[JJj' (-0.9618) 


-0.9606 Toil (-0.9613) 



Table 14. SM predictions of $q^2$-integrated observables at high $q^2$ in the bins $q^2 \in [q^2_{\min}, q^2_{\max}]$
for $B^+ \to K^+\mu\mu$ and $B^0 \to K^{*0}\mu\mu$. We list the mode and the smallest 68% interval of the probability
distribution, along with the value obtained by the conventional method of setting all nuisance
parameters to the prior modes (in parentheses).

at a number of data points, $(x_i, P_i)$. Using the cumulative distribution function

$$ F(x_a|D) = \int_{-\infty}^{x_a} dx\, P(x|D), \qquad (D.1) $$

we can determine the limit $x_a$ at level $a$ from $F(x_a|D) = a$. For convenience, we seek an
analytical expression $g(x)$ interpolating the data points. We constrain $g(\cdot)$ by requiring
that it vanish for negative branching ratios and that it yield the same 10%, 50%, and 90% limits
as obtained from $F(\cdot|D)$:

$$ g(x < 0) = 0, \qquad (D.2) $$
$$ \int_0^{x_a} dx\, g(x) = a, \qquad a = 0.1, 0.5, 0.9. \qquad (D.3) $$

We choose $g(x) = {\rm Amoroso}(x|l, \lambda, \alpha, \beta)$. The Amoroso family [99] is a continuous unimodal
four-parameter family of probability distributions that easily accommodates the
constraints and provides an accurate approximation. Many well known distributions are
direct members or appear as limits of the Amoroso family. Its functional form is



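$$ {\rm Amoroso}(x|l, \lambda, \alpha, \beta) = \frac{1}{\Gamma(\alpha)} \left|\frac{\beta}{\lambda}\right| \left(\frac{x - l}{\lambda}\right)^{\alpha\beta - 1} \exp\left\{ -\left(\frac{x - l}{\lambda}\right)^{\beta} \right\} \qquad (D.4) $$

for $x, l, \lambda, \alpha, \beta \in \mathbb{R}$, $\alpha > 0$,
support $x \geq l$ if $\lambda > 0$, $x \leq l$ if $\lambda < 0$.

We set the location parameter $l$ to the minimum physical value, $l = 0$, and ensure that
$\lambda > 0$ to satisfy (D.2). The scale parameter $\lambda$ and the shape parameters $\alpha$ and $\beta$ are found
by numerically solving the set of three equations (D.3). In the limit of $\alpha \to \infty$ and $\beta = 1$,
Amoroso($\cdot$) converges to a Gaussian distribution [99].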
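
The density (D.4) and the determination of a limit $x_a$ from the cumulative distribution can be sketched as follows. This is a minimal illustration assuming $\lambda > 0$ and $\beta > 0$; the function names, the midpoint-rule integration, and the test case ($\alpha = \beta = \lambda = 1$, for which the density reduces to an exponential distribution) are choices made here and are not taken from the analysis.

```python
import math


def amoroso_pdf(x: float, l: float, lam: float, alpha: float, beta: float) -> float:
    """Amoroso density of (D.4); returns 0 outside the support (lam > 0 assumed)."""
    t = (x - l) / lam
    if t <= 0.0:
        return 0.0
    log_pdf = (-math.lgamma(alpha) + math.log(abs(beta / lam))
               + (alpha * beta - 1.0) * math.log(t) - t ** beta)
    return math.exp(log_pdf)


def limit(level: float, l: float, lam: float, alpha: float, beta: float,
          x_max: float, steps: int = 20_000) -> float:
    """Smallest grid point x_a with F(x_a) >= level, using the midpoint rule."""
    h = (x_max - l) / steps
    cumulative = 0.0
    for i in range(steps):
        cumulative += amoroso_pdf(l + (i + 0.5) * h, l, lam, alpha, beta) * h
        if cumulative >= level:
            return l + (i + 1) * h
    raise ValueError("level not reached below x_max")


# Exponential special case: the 50% and 90% limits are ln 2 and ln 10.
x50 = limit(0.5, 0.0, 1.0, 1.0, 1.0, x_max=20.0)
x90 = limit(0.9, 0.0, 1.0, 1.0, 1.0, x_max=20.0)
```

Solving (D.3) then amounts to adjusting $\lambda$, $\alpha$, and $\beta$ until the three limits returned by such a function agree with those of the tabulated posterior.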

D.2 LogGamma Distribution 

Consider a nuisance parameter $\nu$ whose reported uncertainties are asymmetric, $\nu = \mu^{+b}_{-a}$, $a \neq b$.
In this case, we use the LogGamma distribution [99] to obtain a continuous prior over
the given range of $\nu$. The LogGamma family is a continuous unimodal three-parameter
family of probability distributions

$$ {\rm LogGamma}(\nu|l, \lambda, \alpha) = \frac{1}{\Gamma(\alpha)|\lambda|} \exp\left\{ \alpha\left(\frac{\nu - l}{\lambda}\right) - \exp\left(\frac{\nu - l}{\lambda}\right) \right\} \qquad (D.5) $$

for $\nu, l, \lambda, \alpha \in \mathbb{R}$, $\alpha > 0$,
support $-\infty < \nu < \infty$.

The three parameters are uniquely fixed by demanding that the mode of $P(\nu)$ be at $\mu$,
that the interval $[\mu - a, \mu + b]$ contain 68%, and that the density be identical at $\mu - a$ and
$\mu + b$. More concisely, we have three conditions:

$$ \underset{\nu}{\rm argmax}\, P(\nu) = \mu \qquad (D.6) $$
$$ \int_{\mu - a}^{\mu + b} d\nu\, P(\nu) = 0.68 \qquad (D.7) $$
$$ P(\mu - a) = P(\mu + b). \qquad (D.8) $$

While the first constraint is used to fix the location parameter $l$, the scale parameter $\lambda$
and the shape parameter $\alpha$ must be extracted numerically by solving the coupled equations
(D.7) and (D.8). For a finite range of say $[\nu_{\min}, \nu_{\max}]$, the resulting density is normalized
such that $\int_{\nu_{\min}}^{\nu_{\max}} d\nu\, P(\nu) = 1$.

The asymmetry is governed by $\alpha$: LogGamma($\cdot$) approaches a Gaussian distribution
in the limit $\alpha \to \infty$.
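
The three conditions can be turned into a small numerical recipe. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the location follows from setting the derivative of the exponent in (D.5) to zero, which gives a mode at $l + \lambda\ln\alpha$, and the residuals of (D.7) and (D.8) are only evaluated here; in practice they would be passed to a two-dimensional root finder of the reader's choice.

```python
import math


def loggamma_pdf(nu: float, l: float, lam: float, alpha: float) -> float:
    """LogGamma density of (D.5)."""
    t = (nu - l) / lam
    return math.exp(alpha * t - math.exp(t) - math.lgamma(alpha)) / abs(lam)


def location_from_mode(mode: float, lam: float, alpha: float) -> float:
    """Condition (D.6): the density peaks at l + lam * ln(alpha)."""
    return mode - lam * math.log(alpha)


def interval_probability(lo: float, hi: float, l: float, lam: float,
                         alpha: float, steps: int = 4000) -> float:
    """Probability content of [lo, hi], integrated with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(loggamma_pdf(lo + (i + 0.5) * h, l, lam, alpha)
                   for i in range(steps))


def residuals(lam: float, alpha: float, mu: float, a: float, b: float):
    """Residuals of (D.7) and (D.8); both vanish at the solution."""
    l = location_from_mode(mu, lam, alpha)
    r_prob = interval_probability(mu - a, mu + b, l, lam, alpha) - 0.68
    r_dens = loggamma_pdf(mu - a, l, lam, alpha) - loggamma_pdf(mu + b, l, lam, alpha)
    return r_prob, r_dens
```

For large $\alpha$ and $\lambda = \sigma\sqrt{\alpha}$ the density is close to a Gaussian of width $\sigma$, so symmetric errors $a = b = \sigma$ give residuals close to zero; asymmetric errors require a finite $\alpha$, with the sign of $\lambda$ selecting which tail is longer.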